Reference to Government Grant

This invention was made with government support under a grant from the National 
Institutes of Health, grant numbers PHS CA57973 and AI40034. The government has certain 
rights in this invention. 

Related Applications 

This application claims priority to, and incorporates herein in its entirety, U.S. 
60/082,964 filed April 24, 1998. 

Background of the Invention 

(1) Field of the Invention 

This invention relates generally to the development of therapies for treating hepatitis
C virus (HCV) and bovine viral diarrhea virus (BVDV) infections and more particularly to the
identification of such therapies using chimeric viruses comprising genomic sequences
derived from HCV and BVDV.

(2) Description of the Related Art 

The Flaviviridae is an important family of human and animal RNA viral pathogens
(Rice, CM. 1996. Flaviviridae: The viruses and their replication. In: Fields BN, Knipe DM,
Howley PM, eds. Fields virology. Philadelphia: Lippincott-Raven Publishers, pp. 931-960).
The three currently recognized genera of the Flaviviridae family exhibit distinct differences in
transmission, host range, and pathogenesis. For example, members of the classical flavivirus
genus, such as yellow fever virus and dengue virus, are typically transmitted to vertebrate
hosts via arthropod vectors and cause acute self-limiting disease (Monath TP, Heinz FX.
1996. Flaviviruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology. New York:
Raven Press, pp. 961-1034). The pestiviruses, such as bovine viral diarrhea virus (BVDV)
and classical swine fever virus (CSFV), cause economically important livestock disease and
are spread by direct contact or the fecal-oral route (Thiel et al., 1996. Pestiviruses. In: Fields
BN, Knipe DM, Howley PM, eds. Fields virology. New York: Raven Press, pp. 1059-1073).
The most recently characterized Flaviviridae genus is the hepacivirus genus, the sole member
of which is the common and exclusively human pathogen, hepatitis C virus (HCV). HCV is
transmitted by contaminated blood or blood products and is the most common agent of
non-A, non-B hepatitis, affecting more than 1% of the population worldwide (Houghton, 1996.
Hepatitis C viruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology.
Philadelphia: Lippincott-Raven Publishers, pp. 1035-1058). Unlike flavivirus and pestivirus
infections, which are usually eliminated by the host immune response, chronic HCV infections
are common and can cause mild to severe liver disease, including cancer.

Despite these differences, members of the Flaviviridae family share common
structural features and gene expression strategies. Virus particles consist of a lipid bilayer
envelope with embedded transmembrane glycoproteins surrounding a protein-RNA
nucleocapsid. Genome RNAs are single-stranded, of positive polarity, and function as the sole
mRNA species for translation of a single long open reading frame (ORF). This ORF is
translated into a polyprotein which is processed by cellular and viral proteases into mature
viral proteins. Structural proteins destined for incorporation into virus particles are encoded
in the N-terminal portion of the polyprotein, while the nonstructural proteins which form
components of the viral RNA replicase are encoded in the remainder.

Replication of the Flaviviridae RNA genome occurs via synthesis of a full-length
negative-strand intermediate and is asymmetric, favoring synthesis of positive-strand RNAs.
However, little is known about the details of this process. For all three genera of the
Flaviviridae family, full-length functional cDNA clones have been constructed and RNAs
transcribed from these cDNA templates are infectious. For flaviviruses and pestiviruses,
mutagenesis of these clones and efficient RNA transfection of permissive cell cultures
provide a means of probing the role of cis RNA elements and viral proteins in replicase
assembly and function. Such analyses are not yet possible for HCV since this virus is unable
to replicate efficiently in cell culture.

As with many other RNA viruses, the 5' and 3' terminal sequences of the
Flaviviridae are believed to contain conserved cis-elements important for translation, RNA
replication, and packaging (Bukh et al., Proc. Natl. Acad. Sci. USA 89:4942-4946, 1992; Deng
et al., Nucleic Acids Res. 21:1949-1957, 1993; Cahour et al., Virol. 207:68-76, 1995;
Kolykhalov et al., J. Virol. 70:3363-3371, 1996; Men et al., J. Virol. 70:3930-3937, 1996;
Tanaka et al., J. Virol. 70:3307-3312, 1996; Huang HV. 1997. Evolution of the alphavirus
promoter and the cis-acting sequences of RNA viruses. In: Saluzzo J-F, Dodet B, eds. Factors
in the emergence of arbovirus diseases. Paris: Elsevier Press, pp. 65-79; Mandl et al., J.
Virol. 72:2132-2140, 1998). The 5' nontranslated region (NTR) functions initially at the level
of translation. Similar to most cellular mRNAs, flavivirus genome RNAs are translated in a
cap-dependent manner. These RNAs contain a 5' cap structure that is presumably added by
virus-encoded RNA triphosphatases, guanylyl-, and methyl-transferases (Rice, 1996, supra).
In contrast, the translational strategy employed by pestiviruses and HCV is more similar to
that of the picornaviruses. These RNAs appear to be uncapped and contain long 5' NTRs with
cis RNA elements that function as internal ribosome entry sites (IRES) for translation
initiation at the polyprotein AUG (Lemon et al., Semin. Virol. 8:274-288, 1997).

The 5' NTRs of HCV and BVDV have a similar structural and functional organization
despite containing only short stretches of high sequence identity (Wang et al., Curr. Top.
Microbiol. Immunol. 203:99-115, 1995; Lemon et al., 1997, supra). The IRES within each
NTR is located at the 3' end of the NTR at a position proximal to the AUG initiation codon of
the ORF. Although the 5' terminal sequence of each of these viruses is apparently not
required for IRES function (Rijnbrand et al., FEBS Lett. 365:115-119, 1995; Honda et al.,
Virology 222:31-42, 1996; Rijnbrand et al., J. Virol. 71:451-457, 1997), these sequences are
highly conserved among different strains of HCV (Bukh et al., Proc. Natl. Acad. Sci.
USA 89:4942-4946, 1992) or BVDV (Deng et al., 1993, supra), suggesting they play other
roles in viral replication. For example, sequences in the 5' NTR may be required for
regulating translation versus initiation of negative-strand RNA synthesis. Such regulation
could occur by direct interaction of 5' and 3' RNA elements or indirectly, via RNA-protein
interactions. Sequences in the 5' NTR may also modulate packaging versus translation.
Finally, sequences complementary to the 5' NTR, which are located at the 3' end of
negative-strand RNA, are likely to function in the initiation of positive-strand RNA synthesis.

The HCV 3' NTR contains an internal polypyrimidine tract followed by a highly
conserved sequence of 98 bases at the 3' terminus, which has been shown to be required for
replication of HCV (U.S. Application Serial No. 08/811,566).

Further elucidation of the role of sequences in the HCV 5' and 3' NTRs has been
hampered by the inefficient replication of HCV in cell culture. This aspect of HCV biology
also makes it difficult to identify and test possible antiviral compounds for activity against
HCV. Thus, a need exists for a system which facilitates investigation of HCV replication and
therapeutic approaches to control HCV infections.

Summary of the Invention

Briefly, therefore, the present invention provides novel compositions and methods for
studying HCV replication which are based on the discovery that chimeras of HCV and BVDV
genomic sequences can be constructed that are able to replicate in cell culture. The
BVDV-specific sequence provides the chimeric viral nucleic acid with the ability to replicate
in cell culture, while the HCV-specific sequence allows the chimeric viral nucleic acid to be
used to screen possible compounds for anti-viral activity against HCV. It is believed that
similar replication-competent chimeras can be constructed from HCV and other pestiviruses.

Thus, in one embodiment, the present invention provides a novel, chimeric viral RNA
in which at least one of the 5' NTR, ORF and 3' NTR regions is chimeric and comprises a
nucleotide sequence from the corresponding region of a pestivirus in operable linkage with a
nucleotide sequence from the corresponding region of a hepatitis C virus (HCV). The
chimeric viral RNA is replication-competent. In preferred embodiments, the pestivirus is
BVDV.

In other embodiments, the invention provides a polynucleotide comprising a
DNA-dependent promoter operably linked to a cDNA of a chimeric viral RNA as described
above, and cells transiently transfected or stably transformed with the polynucleotide. In some
embodiments the cDNA may encode a dominant selectable marker or an assayable reporter.

In yet another embodiment, the invention provides a method for identifying
compounds having anti-HCV activity. The method comprises providing a first cell containing
a chimeric viral nucleic acid derived from HCV and a pestivirus as described above and a
second cell containing the pestivirus, and then comparing the replication efficiency of the
chimeric viral nucleic acid in the presence and absence of a test compound to the replication
efficiency of the pestivirus in the presence and absence of the test compound,
wherein a greater compound-induced reduction in replication efficiency of the chimeric viral
nucleic acid than of the pestivirus indicates the compound has anti-HCV activity.
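
By way of illustration only, the comparison recited in this method can be sketched in a few
lines of Python. The function names, the use of a simple fractional reduction as the measure of
compound-induced inhibition, and the example values in the usage note are assumptions
introduced for this sketch; they are not taken from the disclosure and do not limit it.

```python
def fractional_reduction(without_compound: float, with_compound: float) -> float:
    """Fraction by which a test compound lowers replication efficiency.

    Both arguments are replication efficiencies in the same (arbitrary) units,
    e.g. virus titer or labeled genomic RNA. Hypothetical helper for illustration.
    """
    if without_compound <= 0:
        raise ValueError("efficiency without compound must be positive")
    return (without_compound - with_compound) / without_compound


def indicates_anti_hcv_activity(
    chimera_without: float,
    chimera_with: float,
    pestivirus_without: float,
    pestivirus_with: float,
) -> bool:
    """True when the compound reduces replication of the chimera (first cell)
    more than it reduces replication of the parental pestivirus (second cell)."""
    chimera_drop = fractional_reduction(chimera_without, chimera_with)
    pestivirus_drop = fractional_reduction(pestivirus_without, pestivirus_with)
    return chimera_drop > pestivirus_drop
```

For example, under these assumptions a compound that lowers chimera replication from 100 to
10 units while lowering pestivirus replication only from 100 to 90 units would be flagged,
whereas a compound that inhibits both equally would not.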

The invention also provides a genetically-engineered virus which comprises a
chimeric viral nucleic acid derived from HCV and a pestivirus as described above. In one
embodiment the genetically-engineered virus comprises virus particles containing at least one
HCV structural protein and is useful in a vaccine against HCV. In another embodiment, the
genetically-engineered virus is attenuated as compared to the pestivirus and is useful as a
vaccine against the pestivirus.

In a still further embodiment, the invention provides a replication-competent BVDV
vector expressing a heterologous sequence. The BVDV vector comprises the BVDV
sequences encoding the BVDV replication machinery. In some embodiments, the
replication-competent BVDV vector expresses an antigen and is useful as a vaccine.

Brief Description of the Drawings 

Figure 1 is a schematic representation of the 5' NTRs of BVDV, HCV, and EMCV
showing the position of the start codons of the ORF, with boxes indicating the canonical
IRES elements.

Figure 2 shows a schematic representation of BVDV and HCV chimeras, plaque
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and
an indication of whether pseudorevertants arose, with results from BVDV, 5'HCV,
BVDV+HCV, and BVDV+HCVdelB3 chimeras shown in Fig. 2A and results from
BVDV+HCVdelB2B3, BVDV+HCVdelB1B2B3, BVDV+HCVdelB2B3H1, and
BVDV+HCVdelB2B3H1H2 shown in Fig. 2B, where N.D. means not determined.

Figure 3 illustrates the in vitro translation efficiency of BVDV RNA or chimeras
showing bar graphs of the amount of Npro, the N-terminal protein in the BVDV ORF,
expressed by the various constructs.

Figure 4 illustrates a schematic representation of EMCV chimeras, plaque 
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific 
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and 
an indication of whether pseudorevertants arose. 
Figure 5 illustrates a pseudorevertant analysis showing in Fig. 5A the relative
positions of mutations detected within the plaque-purified variants of passaged
BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV, and in Fig. 5B the 5' terminal sequences of
pseudorevertants of BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV. Novel nucleotides or
sequences are shown in bold upper case type. Pseudorevertants are numbered and designated
by the suffix ".R". The upper case sequence in BVDV+HCVdelB1B2B3 and
BVDV+HCVdelB1B2B3.R1 is a remnant of downstream BVDV 5' NTR sequences and was
created during the cloning procedures.

Figure 6 illustrates the construction of derivatives of 5'HCV designed to contain 5'
termini corresponding to the sequence detected within the three analyzed pseudorevertants.
Fig. 6A shows the 5' terminal sequence of the 5'HCV derivatives, with the suffix (orig)
designating a derivative containing the original 5' terminal sequence of the pseudorevertant;
the suffix (cons) designating a derivative containing the consensus tetranucleotide sequence
5'-GUAU at the same position; and novel sequences shown in bold upper case type. Fig. 6B
shows plaque phenotypes, reticulocyte translation efficiencies relative to parental BVDV,
specific infectivities in MDBK cells, and titers at 24 and 48 h post-transfection.
Figure 7 illustrates a single step growth curve for various chimeric constructs 
showing released virus titers measured by performing plaque assays on MDBK cells 
transfected with various constructs. 

Figure 8 illustrates replication of BVDV RNA or chimeric derivatives in transfected
MDBK cells. Equal numbers of MDBK cells (~8 x 10^6) were electroporated with 5 µg of
each in vitro synthesized RNA. MDBK cells were also transfected with infectious yellow
fever 17D and Sindbis RNAs to provide molecular mass markers. One fifth of the transfected
cells were seeded on 35-mm dishes and incubated in D-MEM supplemented with 10% horse
serum for 6 h at 37°C. The media were then replaced with 1 ml of fresh media containing 2
µg/ml of actinomycin D and 40 µCi/ml of 3H-uridine. Incubations were continued for 10 h at
37°C. RNAs were isolated as described in Materials and Methods, and 1/4 of each sample
was denatured in glyoxal and loaded on an agarose gel. (A) Autoradiograph of the dried gel.
Only the portion of the gel containing the genomic RNAs is shown. (B) Amount of
radioactivity contained within the displayed fragments as determined by scintillation
counting. BVDV, lane 1; 5'HCV, lane 2; BVDV+HCVdelB2B3, lane 3;
BVDV+HCVdelB2B3H1, lane 4; 5'HCV.R1orig, lane 5; 5'HCV.R1cons, lane 6;
5'HCV.R3orig, lane 7; 5'HCV.R3cons, lane 8; 5'HCV.R2orig, lane 9; 5'HCV.R2cons, lane 10;
yellow fever 17D, lane 11; Sindbis, lane 12; non-transfected MDBK cells, lane 13. The
experiment shown is one of two repetitions which yielded similar results.

Figure 9 illustrates the genetic map of plasmid pACNR/BUD.

Figure 10 illustrates the sequence of low copy number plasmid pACNR/BVDV
NADL (circular) harboring the functional cDNA of cytopathic BVDV NADL (positive sense
cDNA 5' to 3'; nt 1-12578).

Figure 11 illustrates the sequence of infectious BVDV NADL (positive sense cDNA
5' to 3').

Figure 12 illustrates the sequence of infectious non-cytopathic BVDV NADL lacking
cIns (positive sense cDNA 5' to 3').

Figure 13 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 14 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 15 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 16 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 17 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 18 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 19 illustrates the sequence of prototype HCV-BVDV chimera from
pNADL/5'HR3.orig/GWB with the adapted HCV 5' NTR from 5'HCV/R3.orig and tandem 3'
NTR elements from HCV followed by BVDV (positive sense cDNA 5' to 3') as discussed in
Example 5.

Figure 20 illustrates various deletions of the poly(U) tract in the 3' NTR HCV
sequence of BVDV/HCV chimera p5H-3H33.

Figure 21 illustrates the schematic representation of functional HCV-BVDV chimera
from pCBV/p7.

Figure 22 illustrates the sequence of functional HCV-BVDV chimera from pCBV/p7
(positive sense cDNA 5' to 3').

Figure 23 illustrates the schematic representation of an HCV/BVDV chimera with
selectable marker.

Figure 24 illustrates the sequence of functional HCV-BVDV chimera from
pCBV/p7/IRES-pac expressing a dominant selectable marker conferring resistance to
puromycin (positive sense cDNA 5' to 3').

Figure 25 illustrates the schematic representation of a bicistronic HCV/BVDV 
chimera. 

Figure 26 illustrates the sequence of functional bicistronic chimera expressing the
entire HCV structural region derived from plasmid pNADL/BI#41/HCV str (positive sense
cDNA 5' to 3').

Description of the Preferred Embodiments 

In accordance with the present invention, the inventors herein have succeeded in
generating HCV-BVDV chimeric RNAs which are replication competent. Such chimeras are
useful in screening compounds in vitro for antiviral activity against HCV. In addition, it is
believed that in vivo replication of HCV-BVDV chimeras according to the invention may be
attenuated as compared to wild-type BVDV and thus may be useful in vaccinating animals
against BVDV. It is also believed that the HCV chimeric structures described herein for
BVDV are applicable to other pestiviruses.

In the context of this disclosure, the following terms will be defined as follows unless
otherwise indicated:

"Cis-acting sequences" means the nucleotide sequences from an RNA virus genome 
that are necessary for recognition of the genomic RNA by specific protein(s) of the RNA
virus or host cell that carry out replication, transcription, translation or packaging of the
genome.

"Genetically-engineered virus" means any virus whose genome is different than that 
of a wild-type virus due to a human-made deletion, insertion, or substitution of one or more
nucleotides in the wild-type viral genome.

"Infectious" when used to describe a virus means the virus is capable of entering cells
and initiating a virus replication cycle, whether or not this leads to the production of new
RNA virus particles.

"Nucleotide sequence" as used herein refers to DNA and the corresponding RNA 
sequence where relevant. It will be understood that sequences shown in the Figures are DNA
versions of the RNA sequence and that chimeric molecules of the invention may comprise
RNA molecules or cDNA copies of such RNA molecules.

"Replication-competent" as applied to a chimeric HCV-pestivirus RNA means the 
RNA is capable of RNA-dependent replication in at least one cell type that supports
replication of the wild-type parental pestivirus. The number of replicated RNA molecules
produced by an HCV-pestivirus chimeric RNA of the invention is at least 10-fold higher than
the limit of detection, which is typically 10 to 100 molecules. More preferably, chimeric
RNA production by the HCV-pestivirus chimeric RNA is at least 10^2- to 10^3-fold higher than
the detection limit. The replication-competent chimeric RNA replicates at an efficiency that
is preferably at least 0.001%, more preferably at least 0.01%, more preferably at least 0.1%,
more preferably at least 1%, more preferably at least 10% and most preferably at least 50%
up to 90% that of the parental pestivirus in the same cell type.
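
The numerical criteria in this definition can be expressed, purely as an illustrative sketch, in
Python. The function and constant names are hypothetical and were chosen for this example
only; the thresholds are those recited above, and how replicated RNA molecules are actually
counted is outside the scope of the sketch.

```python
# Preferred relative-efficiency thresholds (percent of parental pestivirus),
# as recited in the definition of "replication-competent".
PREFERRED_THRESHOLDS_PERCENT = (0.001, 0.01, 0.1, 1, 10, 50)


def is_replication_competent(replicated_molecules: float, detection_limit: float) -> bool:
    """Minimum criterion: yield at least 10-fold above the limit of detection."""
    return replicated_molecules >= 10 * detection_limit


def relative_efficiency_percent(chimera_yield: float, parental_yield: float) -> float:
    """Replication of the chimera as a percentage of the parental pestivirus,
    measured in the same cell type and in the same units."""
    if parental_yield <= 0:
        raise ValueError("parental yield must be positive")
    return 100 * chimera_yield / parental_yield


def highest_threshold_met(percent: float):
    """Largest recited threshold that the given percentage reaches, or None."""
    met = [t for t in PREFERRED_THRESHOLDS_PERCENT if percent >= t]
    return max(met) if met else None
```

Under these assumptions, a chimera yielding 1,000 molecules against a detection limit of 100
molecules meets the minimum criterion, and a chimera replicating at 5% of the parental
pestivirus satisfies the "at least 1%" preference but not the "at least 10%" preference.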

"Transfected cell" means a cell containing an exogenously introduced nucleic acid 
molecule, and includes cells that are transiently transfected with the exogenous nucleic acid. 
"Transformed cell" or "stably transformed cell" means a cell containing an 
exogenously introduced nucleic acid molecule which is present in the cytoplasm or nucleus of
the cell and may be stably integrated into the chromosomal DNA of the cell.

"Virus" means a virion, virus particle or a viral genome.

A chimeric viral RNA according to the invention is designed to comprise a 5' NTR,
an ORF, and a 3' NTR, at least one of which is a chimeric region containing two operably
linked nucleotide sequences that are from the same region of a pestivirus and an HCV.
Pestivirus-specific sequences useful in the invention can be taken from the appropriate
genomic region of any cytopathic or noncytopathic type I or type II BVDV isolate, classical
swine fever virus (CSFV) isolate, or border disease viral isolate. For a list of pestiviruses,
see Thiel, H.-J., P. G. W. Plagemann, and V. Moennig. 1996. Pestiviruses, p. 1059-1073. In
B. N. Fields, D. M. Knipe and P. M. Howley (ed.), Fields Virology. Raven Press, New York.
HCV-specific sequences can be taken from any strain or isolate of HCV, including but not
limited to HCV-1, HCV-1a, HCV-1b, HCV-1c, HCV-2a, HCV-2b, HCV-2c, HCV-3a.
Preferably, the parental pestivirus is a cytopathic strain of BVDV and the parental HCV strain
is HCV-1.

The pestivirus- and HCV-specific sequences are operably linked in the chimeric
region, meaning the sequences are arranged such that the resulting chimeric structure is
functional in the context of replication of the pestivirus. For example, in one preferred
embodiment the chimeric viral RNA comprises a chimeric 5' NTR which comprises a
BVDV-specific 5' terminal sequence of 5'-(G/A)UAU and an IRES derived from HCV, with
the ORF and the 3' NTR consisting of a sequence from the same regions of BVDV. The
BVDV-specific sequences at the 5' terminus and in the ORF and 3' NTR are chosen such that
they are functional in the context of BVDV, meaning the chimeric viral RNA expresses the
replication machinery of BVDV and this replication machinery is capable of replicating the
chimeric RNA. In addition, translation of the BVDV ORF in the chimeric viral RNA is
dependent upon a functional HCV IRES. The presence of a functional HCV IRES in this
chimera allows the chimera to be used to screen for compounds that target the HCV IRES and
thereby inhibit translation of the BVDV ORF as well as replication of the chimeric virus.
Such compounds would be expected to also inhibit translation of the ORF in a wild-type HCV
and consequently inhibit HCV replication.
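
As a purely illustrative aid, the 5' terminal requirement of this preferred embodiment, namely
a 5'-(G/A)UAU tetranucleotide, can be checked on a candidate sequence with a short Python
function. The function name is hypothetical, and the acceptance of cDNA-style input (T in
place of U) is an assumption made for convenience because the Figures present DNA versions
of the RNA sequences.

```python
def has_bvdv_5prime_terminus(sequence: str) -> bool:
    """Return True if the sequence begins with 5'-(G/A)UAU.

    Accepts RNA or cDNA-style input; T is treated as U and case is ignored.
    Only the first four nucleotides are examined.
    """
    rna = sequence.strip().upper().replace("T", "U")
    return rna.startswith(("GUAU", "AUAU"))
```

Such a check addresses only the terminal tetranucleotide; it says nothing about whether the
downstream HCV IRES or the BVDV ORF and 3' NTR are functional.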

Compounds that could be screened for anti-HCV activity using this and other
HCV-BVDV 5' NTR chimeras include but are not limited to antisense RNAs, RNA decoys that
bind proteins involved in recognition of the HCV-specific sequences, ribozymes, and small
molecule inhibitors of critical RNA-protein interactions. The use of such substances for
therapeutic applications is known in the art. See, e.g., Amarzguioui M, et al., "Hammerhead
ribozyme design and application." Cell Mol Life Sci. 1998 Nov;54(11):1175-202; Welch PJ,
et al., "Expression of ribozymes in gene transfer systems to modulate target RNA levels."
Curr Opin Biotechnol. 1998 Oct;9(5):486-96; Bramlage B, et al., "Designing ribozymes for
the inhibition of gene expression." Trends Biotechnol. 1998 Oct;16(10):434-8; Gewirtz AM,
et al., "Nucleic acid therapeutics: state of the art and future prospects." Blood. 1998 Aug
1;92(3):712-36; Altman S., "RNase P in research and therapy." Biotechnology (N Y). 1995
Apr;13(4):327-9; Flanagan WM., "Antisense comes of age." Cancer Metastasis Rev. 1998
Jun;17(2):169-76; Agrawal S, et al., "Antisense therapeutics." Curr Opin Chem Biol. 1998
Aug;2(4):519-28; Caselmann WH, et al., "Synthetic antisense oligodeoxynucleotides as
potential drugs against hepatitis C." Intervirology 1997;40(5-6):394-9; Neckers LM.,
"Oligodeoxynucleotide inhibitors of function: mRNA and protein interactions." Cancer J Sci
Am. 1998 May;4 Suppl 1:S35-42; Agrawal S, et al., "Mixed backbone oligonucleotides:
improvement in oligonucleotide-induced toxicity in vivo." Antisense Nucleic Acid Drug Dev.
1998 Apr;8(2):135-9; Crooke ST., "An overview of progress in antisense therapeutics."
Antisense Nucleic Acid Drug Dev. 1998 Apr;8(2):115-22; Fraisier C, et al., "High level
inhibition of HIV replication with combination RNA decoys expressed from an HIV-Tat
inducible vector." Gene Ther. 1998 Dec;5(12):1665-76; Gervaix A, et al., "Gene therapy
targeting peripheral blood CD34+ hematopoietic stem cells of HIV-infected individuals."
Hum Gene Ther. 1997 Dec 10;8(18):2229-38; Nakaya T, et al., "Inhibition of HIV-1
replication by targeting the Rev protein." Leukemia 1997 Apr;11 Suppl 3:134-7; Nakaya T, et
al., "Decoy approach using RNA-DNA chimera oligonucleotides to inhibit the regulatory
function of human immunodeficiency virus type 1 Rev protein." Antimicrob Agents
Chemother. 1997 Feb;41(2):319-25; Smith C, et al., "Transient protection of human T-cells
from human immunodeficiency virus type 1 infection by transduction with adeno-associated
viral vectors which express RNA decoys." Antiviral Res. 1996 Oct;32(2):99-115; Bahner I, et
al., "Transduction of human CD34+ hematopoietic progenitor cells by a retroviral vector
expressing an RRE decoy inhibits human immunodeficiency virus type 1 replication in
myelomonocytic cells produced in long-term culture." J Virol. 1996 Jul;70(7):4352-60; Lee
SW, et al., "Inhibition of human immunodeficiency virus type 1 in human T cells by a potent
Rev response element decoy consisting of the 13-nucleotide minimal Rev-binding domain." J
Virol. 1994 Dec;68(12):8254-64; Lisziewicz J, et al., "Inhibition of human immunodeficiency
virus type 1 replication by regulated expression of a polymeric Tat activation response RNA
decoy as a strategy for gene therapy in AIDS." Proc Natl Acad Sci USA. 1993 Sep
1;90(17):8000-4; Bevec D, et al., "Inhibition of human immunodeficiency virus type 1
replication in human T cells by retroviral-mediated gene transfer of a dominant-negative Rev
trans-activator." Proc Natl Acad Sci USA. 1992 Oct 15;89(20):9870-4.

It is contemplated that a number of replication-competent chimeric structures can be
made that allow the function of various HCV sequence elements and proteins to be studied
and targeted in drug screening assays. For example, the invention includes
replication-competent HCV-pestivirus chimeras having a chimeric ORF. One such chimeric
ORF is one comprising an HCV sequence encoding the structural proteins and a pestivirus
sequence encoding the nonstructural proteins. It is believed that upon introduction into a cell,
such an HCV-BVDV ORF chimera will produce HCV-like virus particles that will be released
from the cell and be capable of infecting cells normally infected by wild-type HCV, i.e., cells
expressing an HCV receptor such as human CD81. Such ORF chimeras would be useful to
screen compounds for drugs that inhibit formation, release or entry of HCV particles. In
addition, ORF chimeras that produce virus particles containing at least one HCV structural
protein would be useful as vaccines against HCV. Other ORF chimeras contemplated by the
invention include, for example, chimeras comprising a pestivirus sequence encoding
structural proteins and an HCV sequence encoding one or more nonstructural proteins such as
the NS3 protease, NS4A cofactor, NS5A phosphoprotein/interferon resistance determinant
and/or the NS5B polymerase. Replication of such ORF chimeras would be dependent upon
the function of the HCV nonstructural protein(s) and these ORF chimeras could be used to
screen for drugs that target the HCV nonstructural protein(s) as well as to screen for and map
potential drug resistance mutations in HCV nonstructural proteins. In addition,
HCV-pestivirus ORF chimeras could be useful for developing alternative in vivo animal models
for HCV replication and HCV-associated hepatocellular carcinoma to evaluate antivirals and
anti-tumor agents.

The invention also provides replication-competent HCV-pestivirus chimeras having a
chimeric 3' NTR which contains one or more conserved elements of the HCV 3' NTR. Such
3' NTR chimeras would be useful for screening or evaluating compounds targeted against the
HCV 3' NTR. Compounds that could be screened include antisense RNA molecules,
ribozymes and small molecule inhibitors of critical RNA-protein interactions. One 3' NTR
chimera according to the invention comprises a BVDV 5' NTR, BVDV ORF and a chimeric
3' NTR which consists of an HCV-specific sequence derived from the HCV 3' NTR
immediately followed by a BVDV 3' NTR. The HCV-specific 3' NTR that allows for
replication in the context of BVDV has a deletion in the 3' NTR poly(U) tract but has all the
other HCV 3' NTR elements, including the 98-base 3' terminal conserved element.

HCV-pestivirus chimeras included within the scope of the invention include those
comprising combinations of chimeric regions, i.e., 5' NTR and ORF chimeras; 5' NTR and 3'
NTR chimeras; ORF and 3' NTR chimeras; and chimeric RNAs in which each of the 5' NTR,
ORF and 3' NTR regions comprises an HCV sequence operably linked to a pestivirus
sequence.

The invention also provides chimeric RNAs having two ORFs, or bicistronic
HCV-pestivirus chimeras. Bicistronic chimeras contemplated by the invention include
structures in which the first ORF contains one or more HCV genes and is followed by a second
IRES operably linked to a second ORF encoding the pestivirus replicase machinery. It is also
contemplated that the first ORF may encode a heterologous sequence such as an antigen.

It is believed that many HCV-pestivirus chimeras of the invention will be attenuated
as compared to the parental wild-type pestivirus. Such attenuated chimeric RNA genomes
would be candidate vaccines in the form of live-attenuated virus particles or as RNA or
cDNA "genetic" vaccines.

The invention also includes vaccines against HCV which comprise an
immunogenically-effective amount of HCV-pestivirus particles or nucleic acid. Anti-HCV
vaccines comprising virus particles should preferably contain one or more HCV structural
proteins.

The therapeutic or pharmaceutical compositions of the present invention can be
administered by any suitable route known in the art including, for example, by injection such
as intraperitoneal, intravenous, subcutaneous, intramuscular, transdermal, intrathecal or
intracerebral injection. Administration can be either rapid, as by injection, or over a period of
time, as by slow infusion or administration of a slow release formulation.

Compositions according to the invention can be employed in the form of
pharmaceutical or veterinary preparations. Such preparations are made in a manner well
known in the pharmaceutical and veterinary arts. One preferred preparation utilizes a vehicle
of physiological saline solution, but it is contemplated that other pharmaceutically acceptable
carriers such as physiological concentrations of other non-toxic salts, five percent aqueous
glucose solution, sterile water or the like may also be used. It may also be desirable that a
suitable buffer be present in the composition. Such solutions can, if desired, be lyophilized
and stored in a sterile ampoule ready for reconstitution by the addition of sterile water for
ready injection. The primary solvent can be aqueous or alternatively non-aqueous.

The carrier can also contain other pharmaceutically-acceptable excipients for
modifying or maintaining the pH, osmolarity, viscosity, clarity, color, sterility, stability, rate
of dissolution, or odor of the formulation. Similarly, the carrier may contain still other
pharmaceutically-acceptable excipients for modifying or maintaining release or absorption or
penetration across the blood-brain barrier. Such excipients are those substances usually and
customarily employed to formulate dosages for parenteral administration in either unit dosage
or multi-dose form or for direct infusion into the cerebrospinal fluid by continuous or periodic
infusion.

It is also contemplated that certain formulations containing a chimeric virus according
to the invention are to be administered orally. Such formulations are preferably encapsulated
and formulated with suitable carriers in solid dosage forms. Some examples of suitable
carriers, excipients, and diluents include lactose, dextrose, sucrose, sorbitol, mannitol,
starches, gum acacia, calcium phosphate, alginates, calcium silicate, microcrystalline
cellulose, polyvinylpyrrolidone, cellulose, gelatin, syrup, methyl cellulose, methyl- and
propylhydroxybenzoates, talc, magnesium stearate, water, mineral oil, and the like. The
formulations can additionally include lubricating agents, wetting agents, emulsifying and
suspending agents, preserving agents, sweetening agents or flavoring agents. The
compositions may be formulated so as to provide rapid, sustained, or delayed release of the
active ingredients after administration to the patient by employing procedures well known in
the art. The formulations can also contain substances that diminish proteolytic degradation
and promote absorption such as, for example, surface active agents.

The specific dose is calculated according to the approximate body weight or body
surface area of the patient or the volume of body space to be occupied. The dose will also be
calculated dependent upon the particular route of administration selected. Such calculations
can be made without undue experimentation by one skilled in the art. Exact dosages are
determined in conjunction with standard dose-response studies. It will be understood that the
amount of the composition actually administered will be determined by a practitioner, in the
light of the relevant circumstances including the condition or conditions to be treated, the
choice of composition to be administered, the age, weight, and response of the individual
patient, the severity of the patient's symptoms, and the chosen route of administration. Dose
administration can be repeated depending upon the pharmacokinetic parameters of the dosage
formulation and the route of administration used.

Replication-competent HCV-pestiviruses are generated by first choosing the HCV
function or sequence element desired to be studied. The HCV sequence can be obtained from
a plasmid clone of a partial or full HCV genome using PCR to amplify a target region
containing the desired sequence or by restriction enzyme digestion. The HCV fragment is
then inserted into the desired location of a clone of the pestivirus genome using standard
techniques. Desired portions of the pestivirus genome may be deleted before or after addition
of the HCV fragment. The recombinant genome is then transfected into a cell that supports
replication of the parental pestivirus genome, and its ability to replicate is assessed using
standard assays. For example, replication can be assessed by virus-induced cytopathic effect;
plaque formation; detection of viral antigens and/or viral RNA accumulation; and by plaque assay
measuring released infectious virus. The inventors herein have found that the BVDV RNA
replication machinery works in many cell types, including bovine, hamster, mouse and human
cells. It has also been reported that BVDV RNAs can amplify in other cell types including
human hepatoma lines and hepatocytes (Behrens SE, et al., J. Virol. 72:2364-2372, 1998).
The host cell range for a particular chimera will be dependent upon the properties of that
chimera as empirically determined.

As described below, some chimeras do not replicate stably as indicated by
heterogeneity in the size of plaques produced by the chimeric virus. Upon passage,
pseudorevertants can frequently be isolated that are capable of stable replication. Such
pseudorevertants will have one or more deletions or base substitutions in the HCV and/or
pestivirus sequences. Information derived from these gain-of-function mutations can be used
to define the elements necessary for generating stable, replication-competent chimeras of
HCV and a pestivirus.

The invention provides a method for screening compounds for antiviral activity
against HCV. The method involves comparing a test compound's effect on replication of a
chimeric HCV-pestivirus RNA molecule as described above with the compound's effect on
replication of the parental pestivirus. Compounds which have a greater effect on replication
of the chimeric virus than the pestivirus are likely directed against the HCV portion of the
chimera. Typically, the method is performed by providing duplicate cell cultures containing a
chimeric viral RNA which is replication-competent in that cell, treating one of the cultures
with the test compound, and then measuring the replication efficiency of the chimeric RNA in
both cultures. Any effect induced by the compound is compared against the compound's
effect on replication of the parental pestivirus in cells of the same type. This control assay is
preferably performed at the same time using the same culture conditions.
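
The comparison underlying this screen can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the disclosed method: the function names, the use of percent inhibition as the readout, and the 25-percentage-point threshold are all assumptions chosen for the example, and any replication readout (plaque counts, RNA accumulation, reporter activity) could be substituted.

```python
def percent_inhibition(untreated: float, treated: float) -> float:
    """Percent reduction in a replication readout caused by the test compound."""
    if untreated <= 0:
        raise ValueError("untreated readout must be positive")
    return 100.0 * (untreated - treated) / untreated


def is_hcv_selective(chimera_untreated: float, chimera_treated: float,
                     parent_untreated: float, parent_treated: float,
                     min_difference: float = 25.0) -> bool:
    """Return True when the compound inhibits the chimera more strongly than
    the parental pestivirus by at least `min_difference` percentage points.

    The threshold is a hypothetical cut-off; the source does not specify one.
    """
    chimera = percent_inhibition(chimera_untreated, chimera_treated)
    parent = percent_inhibition(parent_untreated, parent_treated)
    return (chimera - parent) >= min_difference


if __name__ == "__main__":
    # Hypothetical titers (PFU/ml): strong effect on the chimera, weak on the parent.
    print(is_hcv_selective(1e6, 1e4, 1e6, 9e5))
```

A compound that inhibits both viruses equally would return False, which matches the reasoning above that such a compound is probably acting on the shared pestivirus machinery or on the host cell.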

The cells used in the screening assay can be prepared by transiently transfecting the
cells with the desired chimeric RNA molecule as described below. Alternatively, it is
contemplated that the chimeric RNA molecule can be constitutively expressed in the cell by
transfecting the cell with a polynucleotide comprising a cDNA of the chimeric RNA operably
linked to a DNA-dependent promoter. The chimeric cDNA may include a selectable marker,
which would allow for selection of cells expressing the chimeric RNA. It is also envisioned
that the selectable marker could be a dominant marker that allows selection of cells expressing
chimeras having adaptive mutations or selection of cells permissive for virus replication
(Frolov et al., J. Virol. 73:3854-3865, 1999). It is also contemplated that the cDNA could
express a reporter gene that could be assayed to measure RNA replication.

Alternatively, chimeric virus particles are incubated with a cell permissive for
infection by the pestivirus in the presence or absence of the test compound and then
replication of the chimeric virus is measured and compared to the replication of the parental
pestivirus incubated with the same cell type in the presence or absence of the test compound.
Inhibition of replication can be measured in many ways, including assaying for
reduction of virus-induced cytopathic effect; inhibition of plaque formation; reduced
production of viral antigens as detected by immunofluorescence assay; reduced viral RNA
accumulation; and reduction in released infectious virus from treated and untreated control and
chimera samples using a plaque assay. In addition, it is contemplated that a cell line that is
designed for pestivirus-specific transactivation of a reporter gene could be used directly or in
lieu of a plaque assay. The reporter gene is operably linked to a promoter that is activated
upon infection by the chimeric virus and production of the viral transactivator protein.

Preferred embodiments of the invention are described in the following examples.
Other embodiments within the scope of the claims herein will be apparent to one skilled in the
art from consideration of the specification or practice of the invention as disclosed herein. It
is intended that the specification, together with the examples, be considered exemplary only,
with the scope and spirit of the invention being indicated by the claims which follow the
examples.

Example 1

This example illustrates the construction and analysis of 5' HCV-BVDV chimeras as
reported in detail in Frolov et al. (RNA 4:1418-1435, 1998), which is incorporated in its
entirety by reference. A functional clone of BVDV (Mendez et al., J. Virol. 72:4737-4745,
1998) was used to construct and characterize a series of 5' NTR chimeras with sequences
derived from HCV and the picornavirus, encephalomyocarditis virus (EMCV). The results
help to define the requirements of a functional BVDV 5' NTR and provide
replication-competent BVDV-HCV chimeras dependent on a functional HCV IRES.

Example 2 

This example illustrates the construction of chimeras for expressing additional
functional portions of the HCV genome by addition of further HCV sequence downstream
from the functional or adapted HCV 5' NTR chimeras fused in-frame to the BVDV ORF.

One such construct (Figure 21) involves fusion of HCV sequences to BVDV
sequences in the p7 protein coding region (at a convenient BseRI restriction site). Both HCV
and BVDV encode a p7 protein that is located immediately downstream of the E2 protein.
The p7 protein is a small hydrophobic protein of unknown function. pCBV/p7 consists of the
first 79 bases of the BVDV 5' NTR encoding stem loop structures B1' and B1, followed by the
entire HCV 5' NTR, the entire HCV structural protein coding region and the first 36 amino
acids of HCV p7 fused to the C-terminal 31 amino acids of BVDV p7. The fused p7 gene is
followed by the remainder of the BVDV ORF including the entire nonstructural region and
the BVDV 3' NTR. Transfection of MDBK cells with the RNA corresponding to this
sequence (Figure 22) leads to replication of the chimeric RNA and production of the expected
HCV and BVDV polyprotein cleavage products. Variations on this strategy are envisioned in
which all or part of the HCV polyprotein and cis elements important for RNA packaging can
be expressed in viable chimeras. In addition, the BVDV replicase regions for either cytopathic
or non-cytopathic pestiviruses (like NADL cIns-) can be used. Transfection of cells
permissive for HCV particle assembly, release and reinfection with this chimeric RNA can
be used to make HCV-like particles. These particles and this infection system can be used (i)
to screen for specific inhibitors of HCV particle assembly, release and reinfection, (ii) for
identifying antibodies capable of neutralizing HCV infectivity and (iii) as live or inactivated
vaccines. Furthermore, this embodiment of the invention demonstrates that the BVDV RNA
replication machinery can be used for expression of heterologous RNA and polypeptide
sequences and can be used as a vehicle for RNA or DNA "genetic" vaccination in which the
BVDV replicase amplifies the level of antigen expression by cytoplasmic RNA-dependent
replication.


Example 3 

This example illustrates chimeric RNAs that are modified to express dominant
selectable markers, assayable markers or FACS-sortable markers.

Such variants can be used to select for chimeras capable of replication in particular
cell types, or to screen for cell types that are permissive for replication of the chimeric RNA.
Selectable markers include, but are not limited to, the genes encoding puromycin resistance
(puromycin N-acetyl transferase; PAC), neomycin resistance, blasticidin resistance,
hygromycin resistance, etc. Assayable markers include, but are not limited to, the genes
encoding β-galactosidase, luciferase, β-glucuronidase, etc. Easily sortable molecules include
single chain antibodies, cell surface markers, and non-toxic protein markers like green
fluorescent protein. In a specific example (Figures 23 and 24), the RNA encoded by
pCBV/p7 was modified to include a cassette at the beginning of the BVDV 3' NTR that is
comprised of the EMCV IRES driving the gene encoding PAC. This chimeric RNA can
replicate, expresses PAC and confers resistance to puromycin. This property can
be used to select for variants of the chimera that are capable of noncytopathic replication in
desired cell types and also provides a means of showing that cells harbor a functional
chimeric RNA. Desired variants can be identified, cloned and further characterized as
described in Example 1. Of note, this location in the BVDV genome and this strategy
for expressing heterologous genes may also be applied to the use of infectious attenuated
pestiviruses as gene expression vectors and as chimeric live vaccines against other animal
pathogens.

Example 4 

This example illustrates the use of the bicistronic strategy as an alternative to the
in-frame fusions described in Example 2.

A specific example is shown in Figure 25 and its sequence in Figure 26. In this
bicistronic chimera, the 5' sequences are identical to that of pCBV/p7 except that the HCV
ORF continues to include the first 246 amino acids of NS4B. The HCV sequence is followed
by the EMCV IRES fused to BVDV Npro, the N-terminal 10 aa of BVDV C, the C-terminal
19 aa of C, 9 N-terminal amino acids of Erns, 48 C-terminal amino acids of E2 and the
remainder of the BVDV NADL ORF and 3' NTR. The constructed BVDV ORF encodes a
functional BVDV RNA replicase. The deletions in the N-terminal portion of this ORF were
designed to preserve proper membrane topology and processing of the replicase. The
bicistronic chimeric RNA can replicate upon transfection of permissive BVDV host cells.


Example 5 

This example illustrates 3' NTR chimeras. Although initial attempts to recover viable
chimeric viruses in which the BVDV 3' NTR was completely replaced by that of HCV were
unsuccessful, a strategy similar to that detailed in Example 1 has produced chimeras that
harbor the conserved elements of the HCV 3' NTR. An initial tandem 3' NTR construct was
made in which the HCV 3' NTR was engineered to follow the BVDV ORF. The complete
BVDV 3' NTR was positioned 3' to the HCV 3' NTR after a short heterologous sequence. The
sequence of this parental construct, which replicated poorly, is shown in Figure 19. RNAs
transcribed from this plasmid were of low specific infectivity, suggesting that revertants or
pseudorevertants might have arisen. Indeed, isolation and sequence analysis of several
independent plaque-forming variants revealed that deletions in the HCV poly U tract of
various lengths had occurred. These revertant sequences are shown in Figure 20. When these
altered HCV 3' NTRs were reconstituted into the original tandem 3' NTR parent, they gave
rise to plaque-forming RNA transcripts of high specific infectivity, demonstrating that these
alterations restored the ability of the chimeric RNA to replicate. Large deletions in the U tract
gave rise to virus with more robust replication and larger plaques while stably maintaining the
conserved HCV 3' NTR 98-base element and the polypyrimidine "transition" region. Such
chimeric viruses can now be used to screen and evaluate antisense, ribozyme, and other
therapeutics targeted against this conserved HCV RNA element that is essential for
replication.

Materials and Methods

Plasmid Constructs 

pACNR/BVDV NADL was previously described (Mendez et al., 1998, supra).
pBVDV is a derivative of pACNR/BVDV NADL which contains a G->T transversion at nt
14994 that creates an Xba I site upstream of the T7 promoter (T. Myers & C.M. Rice,
unpubl.). To facilitate construction of the chimeras, subclones were created. First, two
fragments were isolated by PCR amplification of p90/HCVFLlongpU (Kolykhalov et al.,
Science 277:570-574, 1997) with primers #498 (5'-TGTACATGGCACGTGCCAGCCCC)
and #498 (5'-GATCAACTCCATGGTGCACGGTCT) and pBVDV with primers #481
(5'-AGACCGTGCACCATGGAGTTGATC) and #482
(5'-CGTTTCACACATGGATCCCTCCTC). These two fragments were digested with ApaL I
and ligated to produce a fragment containing a fusion of the HCV 5' NTR to the BVDV ORF.
This fragment was digested with Sac I and ligated into pGEM3Zf(-) which had been digested
with Sma I and Sac I to produce the subclone pGEM498-SacI. Next, a fragment containing
the BVDV 5' NTR was synthesized by PCR amplification of pBVDV with primers #183
(5'-TTTTCTAGATAATACGACTCACTATAGTATACGAGAATTAGAAAAGGCACTCG)
and #480 (5'-GGGGGCTGGCACGTGCCATGTACA). This fragment was digested with
Xba I and BsrG I and ligated into pGEM498-SacI digested with the same two enzymes, to
create the plasmid pGEMXbaI-SacI. pGEMXbaI-SacI contains a tandem fusion of the BVDV
5' NTR, the HCV 5' NTR, and the 5' portion of the BVDV Npro gene. pBVDV + HCV was
created by digesting pGEMXbaI-SacI with Xba I and Sac I and ligating the fragment into
pBVDV digested with the same two enzymes. As such, pBVDV + HCV contains the T7
promoter, followed by the entire 385-nt 5' NTR of BVDV, a GT dinucleotide (nt 386-387),
the entire 341-nt 5' NTR of HCV (nt 388-728), and the sequence of the BVDV NADL strain
including the ORF and 3' NTR. Derivatives of pBVDV + HCV containing deletions within
the BVDV 5' NTR and/or the HCV 5' NTR were created in the subclone pGEMXbaI-SacI, as
described below, prior to ligation into Xba I- and Sac I-digested pBVDV. For making
deletions, restriction sites with non-compatible protruding ends were treated with the
Klenow fragment of DNA polymerase I prior to ligation. For creation of pBVDV +
HCVdelB3 (deletion of nt 174-374, inclusive), pGEMXbaI-SacI was digested with Afl II and
BsrG I. For pBVDV + HCVdelB2B3 (deletion of nt 67-374), pGEMXbaI-SacI was digested
with Avr II and BsrG I. For pBVDV + HCVdelB1B2B3 (deletion of nt 33-374),
pGEMXbaI-SacI was digested with SnaB I and BsrG I. For pBVDV + HCVdelB2B3H1 (deletion of nt
67-3396), pGEMXbaI-SacI was digested with Avr II and Xcm I. For pBVDV +
HCVdelB2B3H1H2 (deletion of nt 67-513), pGEMXbaI-SacI was digested with Avr II and
Bsg I. For pBVDV + HCVdelB2B3H3 (deletion of nt 67-374, 518-704), subclone
pGEMXbaI-SacIdelB2B3 was digested with Sma I. p5'HCV was created by digesting
p90/HCVFLlongpU with Xba I and Nru I and ligating the fragment into pBVDV + HCV
digested with the same two enzymes.
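
The nucleotide numbering of the tandem 5' NTR in pBVDV + HCV, and the sizes of the deletions listed above, follow from simple arithmetic. The short Python sketch below works this through as a consistency check; the constant and function names are illustrative assumptions, and only deletions whose coordinates are legible in the text are included.

```python
# Segment lengths stated in the text for the tandem 5' region of pBVDV + HCV.
BVDV_5NTR_LEN = 385   # entire BVDV 5' NTR
LINKER_LEN = 2        # GT dinucleotide
HCV_5NTR_LEN = 341    # entire HCV 5' NTR


def segment_coordinates() -> dict:
    """Return 1-based inclusive (start, end) coordinates of each segment."""
    bvdv = (1, BVDV_5NTR_LEN)
    linker = (bvdv[1] + 1, bvdv[1] + LINKER_LEN)
    hcv = (linker[1] + 1, linker[1] + HCV_5NTR_LEN)
    return {"BVDV 5' NTR": bvdv, "GT linker": linker, "HCV 5' NTR": hcv}


# Deleted ranges (1-based, inclusive) as given for each derivative.
DELETIONS = {
    "HCVdelB3": [(174, 374)],
    "HCVdelB2B3": [(67, 374)],
    "HCVdelB1B2B3": [(33, 374)],
    "HCVdelB2B3H1H2": [(67, 513)],
    "HCVdelB2B3H3": [(67, 374), (518, 704)],
}


def deleted_length(ranges) -> int:
    """Total number of nucleotides removed by a list of inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


if __name__ == "__main__":
    for name, coords in segment_coordinates().items():
        print(name, coords)
    for name, ranges in DELETIONS.items():
        print(name, deleted_length(ranges), "nt deleted")
```

The computed HCV 5' NTR span agrees with the nt 388-728 range stated above, which is a useful check when mapping deletion endpoints onto either parental sequence.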

The EMCV plasmid, pEC9, was provided by Ann Palmenberg and is described
elsewhere (Hahn et al., J. Virol. 69:2697-2699, 1995). p5'EMCV contains the entire 710 nt of
the 5' NTR of EMCV, followed by the open reading frame of BVDV and the 3' NTR. One
extra G residue was added between the T7 promoter and the first nucleotide of the EMCV 5'
NTR to facilitate efficient in vitro transcription. Convenient restriction sites within the
BVDV 5' NTR or the EMCV 5' NTR were used to create additional chimeras. Sites with
noncompatible protruding ends were treated with the Klenow fragment of DNA polymerase I
prior to ligation. For example, the plasmid pBVDV + EMCVdelA contains nt 1-378 of
BVDV 5' NTR fused with nt 45-710 of EMCV (the BsrG I site of BVDV ligated to the EcoR
V site of EMCV). pBVDV + EMCVdelB3A contains nt 1-173 of BVDV fused with nt 45-710
of EMCV (the Afl II site of BVDV ligated to the EcoR V site of EMCV). pBVDV +
EMCVdelB2B3A contains nt 1-66 of BVDV fused with nt 45-710 of EMCV (the Avr II site
of BVDV ligated to the EcoR V site of EMCV). pBVDV + EMCVdelB3ABC contains nt
1-173 of BVDV fused with nt 161-710 of EMCV (the Afl II site of BVDV ligated to the
Psp1406 I site of EMCV). pBVDV + EMCVdelB2B3ABC contains nt 1-66 of BVDV fused with nt
161-710 of EMCV (the Avr II site of BVDV ligated to the Psp1406 I site of EMCV). pBVDV
+ EMCVdelB3A-H contains nt 1-101 of BVDV fused with nt 289-710 of EMCV (the Nhe I
site of BVDV ligated to the Avr II site of EMCV). pBVDV + EMCVdelB2B3A-H contains
nt 1-62 of BVDV fused with nt 289-710 of EMCV (the Avr II site of BVDV ligated to the Avr
II site of EMCV). The schematics of the chimeric 5' NTRs are presented in Figures 2 and 4.
All other heterologous 5' NTRs used in the study were generated by PCR using an
oligonucleotide complementary to nt 256-272 of the HCV 5' NTR and primers containing the
sequence of the Xba I restriction site followed by the T7 promoter, the heterologous
sequences found in sequenced pseudorevertants, or sequences corresponding to different
regions of the HCV 5' NTR. All the fragments were subcloned into the plasmid pRS2 (a
derivative of pUC19), sequenced, and recloned into the p5'HCV plasmid by replacing the
fragment between the Xba I site located upstream of the T7 promoter and the Nhe I site (nt
249-254) in the 5' NTR of HCV.
Cell cultures 

MDBK cells were obtained from M. Collett (ViroPharma, Inc.) and BT cells were
obtained from the American Type Culture Collection (Rockville, Maryland). Cells were
grown in Dulbecco's modified Eagle medium (D-MEM) supplemented with 10% horse serum
and sodium pyruvate.
Transcriptions and transfections 

All the designed plasmids, including pBVDV and the chimeric derivatives, were
digested to completion with Sda I (Sse8387 I), purified by phenol extraction, precipitated by
ethanol, and dissolved in water. The transcription reactions were performed with the T7
Megascript kit (Ambion) using the conditions recommended by the manufacturer.
Reactions were incubated at 37°C for 1 h, and 3H-UTP was added to the reaction to quantify
the RNA synthesis. The quality of the synthesized RNAs was checked by agarose gel
electrophoresis, and samples containing 50-60% of full-length RNA were used for
electroporations and in vitro translations. The reaction mixtures were aliquoted and stored at
-70°C prior to electroporation or in vitro translations.

Transfection was performed by electroporation of MDBK cells using previously
described conditions (Mendez et al., 1998, supra). Two micrograms of in vitro synthesized
RNA, corresponding to approximately 1 µg of the full-length transcript, were used per
electroporation. In standard experiments, ten-fold dilutions of electroporated cells were
seeded in 6-well tissue culture plates containing 5 x 10^5 naive MDBK cells per well. After 1
h of incubation at 37°C in a 5% CO2 incubator, cells were overlaid with 3 ml of 0.6% LE
SeaKem agarose (FMC Bioproducts) containing minimal essential medium supplemented
with 5% horse serum. Plaques were stained with crystal violet after 3 days incubation at
37°C. The rest of the transfected cells were seeded into 100-mm dishes and incubated for
approximately 48 h or until cytopathic effect was observed in virtually all cells. Samples of
the media were taken at 24 and 48 h, and virus titers were determined as described above and
previously (Mendez et al., 1998, supra).
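
Specific infectivity, expressed as plaque-forming units per microgram of RNA, can be back-calculated from the plaque count in a countable well. The following Python sketch shows one way to do this; the function name, the choice to normalize to the plated fraction of electroporated cells, and the example numbers are assumptions made for illustration and are not taken from the protocol above.

```python
def specific_infectivity(plaques: int,
                         fraction_of_cells_plated: float,
                         rna_micrograms: float) -> float:
    """Estimate PFU per microgram of RNA from one countable well.

    plaques: number of plaques counted in the well.
    fraction_of_cells_plated: fraction of the electroporated cells seeded
        into that well (e.g. 1e-5 for a high ten-fold dilution).
    rna_micrograms: amount of transcript RNA used in the electroporation.
    """
    if fraction_of_cells_plated <= 0 or rna_micrograms <= 0:
        raise ValueError("fraction and RNA amount must be positive")
    return plaques / (fraction_of_cells_plated * rna_micrograms)


if __name__ == "__main__":
    # Hypothetical count: 40 plaques in a well receiving 1e-5 of the cells,
    # with about 1 microgram of full-length transcript electroporated.
    print(f"{specific_infectivity(40, 1e-5, 1.0):.2e} PFU/ug")
```

Whether to normalize to total RNA or to the estimated full-length fraction is a choice the experimenter has to make and report; the sketch leaves it to the caller.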

Analysis of the 5' ends of viral genomes

Sequencing of the 5' ends of selected variants of BVDV was performed on
plaque-purified viruses. Plaques were typically isolated from the agarose overlay without staining
with neutral red. Virus was eluted in 1 ml of D-MEM/10% horse serum for several hours and
was used to infect 5 x 10^5 MDBK cells in 35-mm dishes. After 1 h of virus adsorption at
37°C, an additional 1 ml of D-MEM/10% horse serum was added to the dishes, and incubation
was continued for 36-48 h until cytopathic effect was observed in virtually all cells.

Fifty microliters of harvested viral stocks were clarified by low speed centrifugation,
and viral RNAs were isolated by TRIzol reagent (Gibco-BRL) using the protocol
recommended by the manufacturer. Sequencing of the 5' termini was performed using an
oligonucleotide/cDNA-ligation strategy described elsewhere (Troutt et al., Proc. Natl. Acad.
Sci. USA 89:9823-9825, 1992). The primer S1 (5'-GTCGTTTCACACATGGATCC),
complementary to nt 710-729 of the BVDV genome, was used for cDNA synthesis. A
phosphorylated oligonucleotide tag (5'-GACTGTTGTGGCCTGCAGGGCCGAATT) with an
amino group on the 3' terminus was ligated to the first strand cDNA (Troutt et al., 1992,
supra). One tenth of this reaction mixture was used for PCR amplification. The primers for
PCR amplification were as follows: primer A (5'-GCCCTGCAGGCCACAACAGTC),
complementary to the tag; primer B (5'-TCAGGCAGTACCACAA), complementary to nt
281-296 of the HCV 5' NTR; and primer C (5'-GGAATGCTCGTCAAGAAGACAG),
complementary to nt 268-289 of the EMCV 5' NTR. The primer pairs of A + B or A + C
were used for analysis of the pseudorevertants of 5'HCV and BVDV + HCVdelB1B2B3 or
5'EMCV, respectively. For the 5'HCV pseudorevertants, one tenth of the ligation mixture
was used for an additional PCR reaction. This fragment was synthesized using primer S1,
described above, and a primer corresponding to nt 147-175 of the HCV genome. Fragments
were purified by agarose gel electrophoresis and cloned into the plasmid pRS2. Multiple
independent clones were sequenced by the standard dideoxy-mediated chain termination
methods using the Sequenase version 2.0 DNA Sequencing Kit (USB).
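
The stated relationships between these oligonucleotides can be checked computationally. The Python sketch below, offered only as an illustrative consistency check, confirms that primer A is the reverse complement of part of the ligated tag and that each primer's length matches the nucleotide range it is said to anneal to. The sequences are transcribed from the text above; the helper names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneals_to(primer: str, template: str) -> bool:
    """True if the primer is perfectly complementary to a stretch of the template."""
    return reverse_complement(primer) in template.upper()


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive nucleotide range."""
    return end - start + 1


# Sequences as given in the text (written 5' to 3').
TAG = "GACTGTTGTGGCCTGCAGGGCCGAATT"
PRIMER_A = "GCCCTGCAGGCCACAACAGTC"
PRIMER_S1 = "GTCGTTTCACACATGGATCC"
PRIMER_B = "TCAGGCAGTACCACAA"
PRIMER_C = "GGAATGCTCGTCAAGAAGACAG"

if __name__ == "__main__":
    print("primer A anneals to tag:", anneals_to(PRIMER_A, TAG))
    print("S1 length matches nt 710-729:", len(PRIMER_S1) == span_length(710, 729))
    print("B length matches nt 281-296:", len(PRIMER_B) == span_length(281, 296))
    print("C length matches nt 268-289:", len(PRIMER_C) == span_length(268, 289))
```

A check of this kind does not validate the primers against the viral genomes themselves, since those sequences are not reproduced here; it only confirms that the oligonucleotides are internally consistent with the description.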
Cell-free translation 

Cell-free translation reactions were performed in reticulocyte extracts (Promega)
using conditions recommended by the manufacturer. Usually 0.1-1 µg of the same in vitro
synthesized RNAs used in transfection experiments were used in 25 µl translation reactions.
After 45 min of incubation at 30°C, 2 µl were dissolved in 10 µl of sample buffer, and those
samples were analyzed by sodium dodecyl sulfate PAGE. Labeled proteins were visualized
by autoradiography of the dried gel. The efficiency of translation was measured using
phosphorimager analysis (Molecular Dynamics) by comparing the radioactivity in the band
corresponding to the Npro protein. In preliminary experiments, an eightfold increase in
incorporation was observed for translation of 4 µg versus 0.4 µg BVDV transcript RNA.
Quantitative data were obtained from reactions using subsaturating (0.4 µg) amounts of
BVDV or BVDV chimera transcript RNAs.
Analysis of virus specific RNAs

The protocols used for radioactive labeling of virus-specific RNAs are described in
the appropriate figure legends. RNAs were isolated from the cells by using TRIzol reagent as
recommended by the manufacturer (Gibco-BRL). After denaturation with glyoxal in
dimethylsulfoxide, cellular RNAs were analyzed by electrophoresis in a 1% agarose gel
containing a 10 mM phosphate buffer. Pieces of the dried gel containing the appropriate
RNA bands were excised, and their radioactivity measured by liquid scintillation counting.

Results 

Features of the BVDV, HCV, and EMCV 5' NTRs important for chimera design

Schematic representations of the proposed secondary structures of the 5' NTRs of
HCV, BVDV, and EMCV are shown, and the location of each IRES is indicated, in Figure 1.
EMCV is a member of the cardiovirus genus within the family Picornaviridae. While not a
member of the Flaviviridae, EMCV is similar to HCV and BVDV in that it is a
positive-strand RNA virus shown to contain an IRES within its 5' NTR (Jang et al., J. Virol.
62:2636-2643, 1988). Based on their proposed secondary structures, the HCV IRES and the BVDV
IRES have been classified as type 3 IRESs, while the EMCV IRES is classified as a type 2
IRES (Lemon & Honda, Semin. Virol. 8:274-288, 1997). However, these three IRESs as
well as IRESs from other members of the Flaviviridae and the Picornaviridae have been
proposed to contain a common structural core (Le et al., Virus Genes 12:135-147, 1996).

The model for the secondary structure of the 341-nt HCV 5' NTR has been refined by
enzymatic and chemical analysis of synthetic transcripts (Brown et al., Nucl. Acids Res.
20:5041-5045, 1992; Wang et al., J. Virol. 68:7301-7307, 1994; Honda et al., RNA 2:955-968,
1996; Lima et al., 1997). This element contains four discrete hairpins (referred to here as H1,
H2, H3 and H4) and a pseudoknot at the base of hairpin H3 (Wang et al., 1995). The
secondary structure of the 385-nt BVDV 5' NTR has not been as extensively studied, but is
proposed to be similar to that of HCV (Brown et al., 1992) with four discrete hairpins
(referred to here as B1', B1, B2, and B3) and a pseudoknot at the base of B3 (Rijnbrand et al.,
1997). The secondary structure of the longer (>700 nt) EMCV 5' NTR consists of a series of
hairpins A-M (Duke et al., 1992; Hoffman & Palmenberg, 1996). Recently, a revised model
of the EMCV 5' NTR suggests moderately different secondary structures for the C and G
subregions, and significantly different secondary structures for the I-M subregion
(Palmenberg & Sgro, 1997).

For HCV, H1 is nonessential for IRES function (Reynolds et al., 1995; Rijnbrand et
al., 1995; Honda et al., 1996b; Reynolds et al., 1996; Kamoshita et al., 1997) and its deletion
has actually increased translation efficiency in some analyses (Rijnbrand et al., 1995; Honda
et al., 1996b). Most studies have found that hairpins H2 and H3 and the pseudoknot are
essential for IRES function (Wang et al., 1993; Rijnbrand et al., 1995; Honda et al., 1996b).
However, two studies indicate that H2 may not be essential (Tsukiyama-Kohara et al., 1992;
Urabe et al., 1997). The 3' boundary of the HCV IRES is more controversial. The IRES
clearly extends to the AUG initiation codon. However, some studies indicate that sequences
affecting the efficiency of translation initiation extend into the ORF (Reynolds et al., 1995;
Honda et al., 1996a; Honda et al., 1996b; Lu & Wimmer, 1996). By analogy to the HCV
IRES and the related pestivirus CSFV IRES, the BVDV IRES probably requires hairpins B2
and B3 and the pseudoknot for function, with B1' and B1 probably not required for IRES
activity (Poole et al., 1995; Rijnbrand et al., 1997). For EMCV, hairpins H-L have been
shown to be required for IRES function in mono- or dicistronic constructs (Jang & Wimmer,
1990; Duke et al., 1992). The remaining portion of the EMCV 5' NTR is thought to be
required for RNA replication or unknown steps in viral replication that are important for
pathogenesis (Duke et al., 1990; Martin & Palmenberg, 1996).

Replacement of the BVDV 5' NTR with the HCV 5' NTR results in a large decrease in
specific infectivity

Since the BVDV 5' NTR and the HCV 5' NTR are proposed to have similar RNA
secondary structure and functional organization, an experiment was performed to test whether
the BVDV 5' NTR could be replaced by the HCV 5' NTR. p5'HCV has an exact replacement
of the BVDV 5' NTR with that of HCV (Fig. 2A) while the coding sequence and 3' NTR of
p5'HCV are identical to pBVDV. Positioning of the HCV 5' NTR in such a manner was
necessary since translation initiation from the HCV IRES begins at or near the AUG start
codon (Honda et al., 1996a; Reynolds et al., 1995; Reynolds et al., 1996; Rijnbrand et al.,
1996). The specific infectivity of 5'HCV RNA synthesized in vitro was compared to that of
BVDV RNA by transfection of MDBK (bovine kidney) cells (Fig. 2A). The specific
infectivity of BVDV RNA was approximately 4 x 10^6 plaque forming units (PFU)/µg RNA.
In contrast, the specific infectivity of 5'HCV RNA was near the limit of detection (30-50
PFU/µg RNA) and considerable plaque heterogeneity was apparent. These results suggested
that the HCV 5' NTR replacement chimera might be incapable of efficient replication and
plaque formation and that the plaque forming virus observed had arisen by secondary
mutation(s). Sequence analysis of plaque-purified 5'HCV viruses presented below confirmed
that the replicating pool of virus contained such pseudorevertants.
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Next, the in vitro translation efficiency of these two RNAs in rabbit reticulocyte 
extracts was analyzed to test whether the defect in specific infectivity of 5* HCV RNA could 
be attributed to lower translation efficiency. Although the specific infectivity of 5* HCV RNA 
was reduced -5 logs compared to BVDV RNA, its translation efficiency was only slightly 
5 reduced, -twofold (Fig. 3, lane 1 vs. lane 2). The apparent size of the N-terminal cleavage 
product, N pro , was identical for both RNAs, suggesting that translation initiated with the 
correct AUG. These data are consistent with the hypothesis that the BVDV 5' NTR contains 
signals that are required for a step in replication other than translation which are not present in 
the 5' HCV chimera. 
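The "~5 logs" figure follows directly from the specific infectivities quoted above. A minimal sketch of that arithmetic is given here for illustration; the function name is hypothetical and the inputs are simply the approximate values stated in the text.

```python
import math

def log10_reduction(reference_pfu_per_ug: float, test_pfu_per_ug: float) -> float:
    """Orders of magnitude by which the test RNA falls below the reference RNA."""
    return math.log10(reference_pfu_per_ug / test_pfu_per_ug)

BVDV_PFU_PER_UG = 4e6            # approximate value quoted for BVDV RNA
HCV_CHIMERA_PFU_PER_UG = (30, 50)  # quoted range for 5' HCV RNA (near detection limit)

for pfu in HCV_CHIMERA_PFU_PER_UG:
    logs = log10_reduction(BVDV_PFU_PER_UG, pfu)
    print(f"{pfu} PFU/ug is {logs:.2f} logs below BVDV RNA")
```

Both ends of the quoted range land close to five orders of magnitude, whereas the twofold translation difference corresponds to only about 0.3 log.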

Given the low specific infectivity of 5' HCV RNA, an experiment was performed to
test the effect of placing the BVDV 5' NTR sequence upstream of the HCV 5' NTR, resulting
in tandem BVDV and HCV 5' NTRs (called BVDV + HCV). This arrangement actually
decreased translation efficiency (Fig. 3, lane 14 vs. lane 1) yet restored infectivity (Fig. 2A).
The plaques produced by BVDV + HCV were also heterogeneous in size, indicating that this
virus was unstable. Upon passage, RT-PCR analysis indicated that pseudorevertants had
indeed arisen in which portions of the BVDV and/or HCV 5' NTRs had been deleted (data not
shown). These data show that sequences in the BVDV 5' NTR required for virus replication
can function when placed upstream of a functional HCV IRES driving translation of the
BVDV polyprotein.

Hairpins B1' and B1 in conjunction with the HCV IRES are sufficient for stable and
efficient BVDV replication 

The sequences within the BVDV 5' NTR that restored replication in the context of the
HCV 5' NTR were mapped using three deletion variants. The deletion BVDV + HCVdelB3
removed a large portion of hairpin B3; the deletion within BVDV + HCVdelB2B3 removed
hairpins B2 and B3, and the deletion within BVDV + HCVdelB1B2B3 removed hairpins B1,
B2 and B3. The specific infectivities of RNAs from these deletion mutants were near that of
BVDV RNA (Fig. 2). Upon passage of these viruses, RT-PCR analyses and sequencing
indicated that BVDV + HCVdelB3 and BVDV + HCVdelB2B3 were stably propagated and
produced homogeneous plaques slightly smaller than those of wild-type BVDV (data not
shown). In contrast, BVDV + HCVdelB1B2B3 produced smaller heterogeneous plaques.
Reverse transcription-polymerase chain reaction (RT-PCR) analysis and sequencing indicated
that BVDV + HCVdelB1B2B3 underwent a reversion event described in more detail below.
The translation efficiencies of these three RNAs (Fig. 3, lanes 9, 10, and 12) were similar to
that of BVDV + HCV RNA (Fig. 3, lane 14), indicating that the deleted portions (hairpins B1,
B2, and B3) are not required for translation in the BVDV + HCV chimera. These results show
that B1' and B1 are the minimal elements sufficient for stable replication in conjunction with
the HCV 5' NTR.

Having shown that B1' and B1 are sufficient for replication in conjunction with the
HCV 5' NTR, we next conducted a deletion analysis to determine the sequences within the
HCV 5' NTR of BVDV + HCVdelB2B3 required for replication. A large portion of H1 was
deleted in BVDV + HCVdelB2B3H1, while both H1 and H2 were deleted in BVDV +
HCVdelB2B3H1H2. Of these two RNAs, only BVDV + HCVdelB2B3H1 was as infectious as
parental BVDV RNA (Fig. 2B). However, the BVDV + HCVdelB2B3H1 virus produced
smaller plaques than BVDV + HCVdelB2B3, indicating that hairpin H1 may augment
replication of the chimera. In contrast, BVDV + HCVdelB2B3H1H2 RNA was not
infectious (Fig. 2B) and was translated poorly (Fig. 3, lane 11). Diminished HCV IRES
activity might be due to deletion of hairpin H2 or juxtaposition of BVDV hairpins B1' and B1
with H3. A third derivative of BVDV + HCVdelB2B3, with a Sma I-Sma I deletion
abrogating HCV IRES function by removing H3, was also not infectious (data not shown).
Thus, a 5' NTR consisting of B1' and B1 and a functional HCV IRES is sufficient for stable
BVDV replication in MDBK cells. Similar results were obtained in BT cells, another
BVDV-permissive continuous bovine cell line (data not shown).

Replacement of the BVDV 5' NTR with the EMCV 5' NTR

The following experiment was performed to determine whether the BVDV 5' NTR
could be replaced by the 5' NTR of a more phylogenetically distant virus, EMCV. A
derivative of BVDV was created, called 5' EMCV, that contains an exact replacement of the
BVDV 5' NTR with the EMCV 5' NTR plus an additional guanosine residue at the 5' terminus
for more efficient transcription initiation by T7 polymerase (Fig. 4A). The specific infectivity
of 5' EMCV RNA was more than three orders of magnitude lower than that of BVDV RNA,
indicating that it was defective for replication, although its specific infectivity was higher than
that of 5' HCV RNA (compare Figs. 4A and 2A). Similar to 5' HCV, 5' EMCV produced
heterogeneous plaques, and sequence analysis indicated that pseudorevertants had arisen. The
lower specific infectivity of 5' EMCV RNA was not likely due to a defect in translation,
since the translation efficiency of 5' EMCV RNA was about threefold higher in vitro than that
of BVDV RNA (Fig. 3, lane 20 vs. lane 19).

As with BVDV + HCV, it was also determined whether placing the BVDV 5' NTR at the 5'
end of the 5' EMCV RNA would increase its specific infectivity. BVDV + EMCVdelA (Fig.
4A) contained the entire BVDV 5' NTR in tandem with the EMCV 5' NTR lacking a portion
of hairpin A. BVDV + EMCVdelA RNA had a specific infectivity near that of BVDV RNA
(compare Figs. 4A and 2A) despite having a lower translation efficiency than 5' EMCV (Fig.
3, lane 21 vs. lane 20). Similar to the results with BVDV + HCV, this implicates the added
BVDV 5' NTR sequence in a step in viral replication other than translation. Two derivatives
of BVDV + EMCVdelA that contain deletions of portions of the BVDV 5' NTR but maintain
the sequence of B1' and B1, BVDV + EMCVdelB3A and BVDV + EMCVdelB2B3A (Fig.
4A), also were infectious. These derivatives had translation efficiencies near that of the
parental BVDV + EMCVdelA (Fig. 3, compare lanes 15 and 16 with lane 21). This
demonstrated that hairpins B1' and B1 were sufficient for replication in conjunction with a
large portion of the EMCV 5' NTR. Derivatives of BVDV + EMCVdelB3A or BVDV +
EMCVdelB2B3A that contain further deletions of EMCV (BVDV + EMCVdelB3ABC and
BVDV + EMCVdelB2B3ABC in particular) were translated efficiently (Fig. 3, lanes 17 and
18) and were infectious (Fig. 4B). This indicates that the chimeras did not require putative
EMCV RNA replication signals (Martin & Palmenberg, 1996). However, derivatives with
deletions extending into the canonical EMCV IRES were not infectious. For example, BVDV
+ EMCVdelB3A-H and BVDV + EMCVdelB2B3A-H, in which a portion of hairpin H is
deleted, were not infectious (Fig. 4B) and were inefficiently translated in vitro (Fig. 3, lanes
22 and 23). It should be noted that all of the BVDV + EMCV chimeras produced plaques of
heterogeneous size, indicating some instability.

Relatively simple 5' NTR mutations are observed in adapted pseudorevertants

As mentioned previously, BVDV + HCVdelB1B2B3 did not replicate stably as
indicated by the heterogeneity in the size of plaques produced by this virus. Upon passage
and selection of medium plaque-producing variants, 5' RACE analysis and sequencing
indicated that nt 1-26 had been deleted in the pseudorevertants, removing a large portion of
B1' which was apparently deleterious in the absence of B1. This deletion results in the 5'
terminal sequence 5'GUAUCG which is identical to the first six bases of BVDV genome
RNA (Fig. 5) and is repeated at positions 27-32.

Analysis of the passaged 5' EMCV virus indicated that the replicating progeny had
also undergone a simple deletion of sequence at the 5' end to generate more efficiently
replicating variants (Fig. 5). After electroporation, the 5' EMCV virus pool was passaged 5
times at a multiplicity of infection of 0.1-1 PFU/cell on MDBK or BT cells, and the 5' termini
of three randomly picked plaques were sequenced. For all three plaques selected, nt 2-209
had been deleted, again creating a genome RNA with the 5' terminal tetranucleotide sequence
5'-GUAU.
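The coordinate logic of the nt 2-209 deletion can be illustrated with a short sketch. The sequence below is a placeholder, not the EMCV 5' NTR: only the added 5' G and the UAU that must follow the deleted block are implied by the text, and the sketch assumes that the added G is counted as nt 1. The function name is hypothetical.

```python
def delete_nt(seq: str, first: int, last: int) -> str:
    """Delete nucleotides first..last (1-based, inclusive) from seq."""
    if not (1 <= first <= last <= len(seq)):
        raise ValueError("deletion range falls outside the sequence")
    return seq[:first - 1] + seq[last:]

# Placeholder RNA: the added 5' G (nt 1), 208 filler bases (nt 2-209),
# UAU at nt 210-212, then more filler. "N" marks bases not given in the text.
toy_5emcv = "G" + "N" * 208 + "UAU" + "N" * 20

revertant = delete_nt(toy_5emcv, 2, 209)
print(revertant[:4])
```

Under these assumptions the retained 5' G is joined directly to the downstream UAU, giving the 5'-GUAU terminus reported for the three sequenced plaques.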




Analysis of the 5' HCV progeny indicated that more complicated variants had arisen.
Most small plaque-producing variants were unstable and quickly reverted to medium
plaque-producing variants. However, one small plaque-producing variant and two stable medium
plaque-producing variants were isolated. 5' terminal sequences of the variants were amplified
by rapid amplification of cDNA ends (RACE) and cloned into a plasmid vector, and
sequences for several independent colonies were determined. The sequence of three clones of
the small plaque-producing virus (5'HCV.R1) contained a deletion of HCV sequence from nt
1-34 and an addition of the dinucleotide 5'-AU in two clones and 5'-GU in the third clone.
This creates a 5' terminus of 5'-(G/A)UAA (Fig. 5B), reminiscent of the first three bases of
the BVDV genome RNA (5'-GUA). Both medium plaque variants appeared to have arisen by
RNA recombination with non-viral sequences (Fig. 5). One medium plaque variant
(5'HCV.R2) had deleted the first 21 bases of the HCV sequence and contained instead a
heterologous sequence of 22 bases. BLAST searches revealed a perfect match between this
sequence and a sequence in a human retina cDNA of unknown function (Tsp509I). The
second medium plaque variant (5'HCV.R3) had also undergone a possible recombination
event leading to the addition of 12 nt to the 5' end of the HCV sequence. Given its short
length, multiple matches were found in the database with this sequence. As for the small
plaque variant, sequencing of multiple clones revealed heterogeneity at the extreme 5' end,
with either G or A identified as the 5' base. Remarkably, for both medium plaque variants,
the fused heterologous sequence began with the tetranucleotide sequence 5'-(G/A)UAU (Fig.
5B). For all three variants, sequencing of the entire 5' NTR and a portion of the Npro coding
region revealed only these changes at the 5' termini.

5' NTR sequence changes are sufficient for the pseudorevertant phenotypes 

To assess the importance of these alterations at the 5' terminus of the 5' HCV
pseudorevertants, derivatives of 5' HCV were created with the changes determined by 5'
RACE (Fig. 6A), and the specific infectivities of these RNAs were analyzed (Fig. 6B).
Corresponding to the small plaque variant, a derivative called 5'HCV.R1orig was engineered
which contained a 5' NTR consisting of the dinucleotide 5'-GU at the 5' terminus of HCV nt
35-341. This results in a 5' terminus consisting of 5'-GUAA. 5'HCV.R1orig RNA had a
specific infectivity at least four orders of magnitude higher than 5' HCV RNA (Figs. 6B and
2A). This demonstrates that this 5' NTR structure is sufficient for phenotypic reversion to
high specific infectivity. However, small plaques and considerable heterogeneity were
observed for 5'HCV.R1orig, suggesting that additional mutations may be present in the
original small plaque variant.




The engineered derivative 5'HCV.R2orig had a 5' NTR consisting of 22 nt of
Tsp509I-homologous sequence followed by HCV nt 22-341. Another construct, called
5'HCV.R3orig, was made, which has the 12 nt of the other heterologous sequence fused to the
intact HCV 5' NTR. Specific infectivities for both these derivatives were essentially the same
as observed for wild type BVDV RNA (2-4 x 10^6 PFU/µg; Fig. 6B). Transfection with these
transcripts produced medium plaques, as observed for the original variants, and this
phenotype was stable upon passaging. These results show that the altered 5' NTR sequences
were responsible for the pseudorevertant phenotypes rather than changes elsewhere in their
genomes.



Addition of the tetranucleotide sequence 5'-GUAU to the HCV 5' NTR allows efficient
BVDV replication

For all three 5' HCV variants studied, as well as the BVDV + HCVdelB1B2B3 and
5'EMCV pseudorevertants, 5' NTR alterations seemed to involve creation of a three- or
four-base "consensus" sequence identical to the 5' terminus of BVDV genome RNA. To test the
importance of this sequence, as opposed to fused heterologous sequences, we created a set of
variants with the BVDV 5' tetranucleotide sequence linked to the HCV 5' NTR or the
deletion/recombinant break points identified during sequence analysis of the 5' HCV
pseudorevertants (Fig. 6A). 5'HCV.R1cons had the tetranucleotide sequence 5'-GUAU fused
to HCV nt 35-341. 5'HCV.R2cons had the 5'-GUAU tetranucleotide sequence fused to HCV
nt 22-341. 5'HCV.R3cons contained the tetranucleotide sequence 5'-GUAU fused to the intact
5' terminus of the HCV NTR. RNAs from all three of these derivatives had specific
infectivities more than five orders of magnitude higher than 5'HCV and comparable to
parental BVDV (Fig. 6B).
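The layout of the three "cons" constructs can be sketched in a few lines. This is an illustration only: the 341-nt string below is a placeholder standing in for the HCV 5' NTR (it is not the real sequence), and the helper names are hypothetical. The check uses R (G or A) followed by UAU, the consensus described above.

```python
import re

HCV_NTR_LENGTH = 341                               # the text refers to HCV nt 1-341
PLACEHOLDER_NTR = ("ACGU" * 86)[:HCV_NTR_LENGTH]   # NOT the real HCV 5' NTR

def fuse(prefix: str, ntr: str, first_nt: int) -> str:
    """Join prefix to ntr nucleotides first_nt..end (1-based, inclusive)."""
    return prefix + ntr[first_nt - 1:]

RUAU = re.compile(r"[GA]UAU")   # R = purine (G or A)

def has_ruau_terminus(rna: str) -> bool:
    """True if the RNA begins with the 5'-(G/A)UAU consensus."""
    return RUAU.match(rna) is not None

constructs = {
    "5'HCV.R1cons": fuse("GUAU", PLACEHOLDER_NTR, 35),   # GUAU + HCV nt 35-341
    "5'HCV.R2cons": fuse("GUAU", PLACEHOLDER_NTR, 22),   # GUAU + HCV nt 22-341
    "5'HCV.R3cons": fuse("GUAU", PLACEHOLDER_NTR, 1),    # GUAU + intact 5' NTR
}

for name, rna in constructs.items():
    print(name, len(rna), has_ruau_terminus(rna))
```

The same check returns False for a terminus such as 5'-GUAA, the one carried by 5'HCV.R1orig, which matches only the first three bases of the BVDV terminus.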

There were, however, significant differences between the phenotypes of some of
these derivatives versus the reconstructed pseudorevertants. As mentioned above,
5'HCV.R1orig yielded tiny and small plaques and produced low virus yields even after 48 h.
In contrast, the addition of four bases rather than two bases (5'-GUAU vs. 5'-GU) yielded
virus with near wild-type plaque morphology (Fig. 6B) and growth rates (Fig. 7). In the case
of the smaller deletion, 5'HCV.R2orig and 5'HCV.R2cons were indistinguishable, suggesting
that, other than the 5' four bases, the fused heterologous sequences were dispensable. This
was not the case, however, for the chimera containing the 5'-GUAU tetranucleotide sequence
fused to the intact HCV 5' NTR. 5'HCV.R3cons produced small plaques (Fig. 6B) and grew
more slowly than 5'HCV.R3orig (Fig. 7), suggesting that the sequence/structure of the
sequences downstream of the 5' four bases can affect replication efficiency.

The tetranucleotide sequence 5'-GUAU is important for efficient BVDV RNA
accumulation 

Next, the effects of the different 5' termini on virus-specific RNA accumulation
directly after transfection were analyzed. This allowed a direct comparison between 5'HCV
and the reconstructed pseudorevertants as well as selected BVDV + HCV deletion constructs.
MDBK cells were transfected with in vitro synthesized RNAs and labeled for 10 h beginning
at 5 h post-transfection with 3H-UTP in the presence of actinomycin D (Fig. 8). RNA
replication of the 5' HCV chimera was severely impaired to a level below detection (Fig. 8,
lane 2). In contrast, every 5' NTR alteration of 5' HCV that increased RNA specific
infectivity and allowed efficient virus growth led to readily detectable viral RNA
accumulation. Addition of B1' and B1 to the 5' terminus of the HCV 5' NTR restored RNA
replication to a level ~50% of that observed for BVDV (BVDV + HCVdelB2B3; Fig. 8, lane
3 vs. lane 1). BVDV + HCVdelB2B3H1 displayed reduced RNA synthesis compared to
BVDV + HCVdelB2B3 (Fig. 8, lane 4 vs. lane 3), perhaps explaining its small plaque
phenotype and suggesting a possible positive role for H1 in replication of this chimera.
5'HCV.R1orig, which had exhibited plaque heterogeneity and slow growth, accumulated less
RNA when compared to 5'HCV.R1cons (Fig. 8, lane 5 vs. lane 6). 5'HCV.R2orig and
5'HCV.R2cons showed similar RNA accumulation (Fig. 8, lane 9 vs. lane 10), consistent with
their medium plaque phenotypes; and 5'HCV.R3cons exhibited reduced RNA synthesis
compared to 5'HCV.R3orig (Fig. 8, lane 8 vs. lane 7), consistent with their small- versus
medium-plaque phenotypes.

Although these RNA phenotypes are complex, the most striking result is that addition
of the B1' and B1 hairpins, addition of heterologous 5' sequences terminating with 5'-GUAU, or
simply fusion of this tetranucleotide sequence with the HCV 5' NTR or short 5' truncations of
the HCV 5' NTR all dramatically upregulated RNA accumulation. This occurred without
increasing translation efficiency, at least as measured in a cell-free assay (Fig. 3, compare
lanes 3-8 to lane 1), suggesting that these sequences function at the level of RNA replication
or stability.

Discussion

The work presented here helps to define the requirements for a functional BVDV
5' NTR. The BVDV-specific 5' NTR sequences required for efficient replication in cell
culture are minimal and consist of the 5' terminal sequence, 5'-GUAU. The sequence
5'-AUAU, detected for some pseudorevertants, may also be functional but this was not tested for
technical reasons. This simple 5'-terminal tetranucleotide sequence, which is conserved
among pestiviruses (Ruggli et al., 1996; Becher et al., 1998), was shown to function in the
context of functional IRES elements derived from the hepacivirus HCV or the picornavirus
EMCV. As discussed below, this may indicate that the 5' signals required for BVDV RNA
replication are rather simple or that elements in these heterologous IRESs can functionally
replace deleted BVDV sequences.

Sequences at the extreme 5' end of BVDV genome RNA could modulate the
efficiency of RNA accumulation by affecting RNA stability, translation, promoter efficiency,
or some combination of these processes. At this time, we cannot distinguish among these
possibilities but favor an effect on RNA replication. The complement of the BVDV 5'
sequence at the 3' end of the negative-strand RNA presumably functions in the initiation of
positive-strand RNA synthesis. Thus, AUAC-3' at the 3' terminus of minus-strand RNA may
be important for positive-strand RNA synthesis. Interestingly, for some positive-strand RNA
viruses such as rubella virus (Pugachev & Frey, 1998), flock house virus (Ball, 1994) and
turnip crinkle virus (Guan et al., 1997), only minimal cis-acting sequences at the 3' termini of
negative-strand RNAs are required for positive-strand RNA synthesis. In contrast to the 5' NTR
replacements, we were unable to generate replication-competent BVDV-HCV chimeras with
the HCV 3' NTR replacing that of BVDV (data not shown). This may indicate that the signals
within the pestivirus 3' NTR required for initiation of negative-strand RNA synthesis are more
complex and virus specific. Once the replication complex has assembled at the 3' NTR and
traversed the RNA during negative-strand synthesis, the requirements of the 5' NTR for
initiation of positive-strand synthesis may be minimal.
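The relationship between the plus-strand 5' terminus and the minus-strand 3' terminus stated above is a plain reverse complement, which the following minimal sketch reproduces. The function name is illustrative and not taken from the specification.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]

plus_strand_5prime = "GUAU"   # 5' terminus of BVDV genome RNA, as given in the text
minus_strand_3prime = reverse_complement(plus_strand_5prime)
print(f"...{minus_strand_3prime}-3'")
```

Because the minus strand is antiparallel to the genome, the complement of the genomic 5'-GUAU is the last four bases of the minus strand, read as AUAC-3'.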

Although the RNA replication signals within the 5' NTR appear to be rather simple, it
is possible that the signals important for RNA replication actually extend into the IRES and
are more complicated. For instance, the 5' HCV pseudorevertants were more stable and grew
to higher titers than the 5'EMCV counterparts, despite the fact that the 5'EMCV RNAs were
translated more efficiently in vitro. This may indicate that the BVDV and HCV IRESs
contain signals important for RNA synthesis that are absent in the EMCV IRES.

It is perhaps not surprising that 5' HCV appeared to recombine with cellular mRNAs
to acquire a 5' terminus with the 5'-(G/A)UAU consensus, given that non-cytopathic strains
of BVDV can recombine with BVDV RNA or cellular mRNAs to generate cytopathic strains
of BVDV (Meyers & Thiel, 1996). Presumably, this recombination event involves template
switching during negative-strand RNA synthesis, as observed for poliovirus (Kirkegaard &
Baltimore, 1986). In contrast to 5' HCV, simple deletions of 5' terminal viral sequences could
account for the BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants since the
tetranucleotide sequence is present in these 5' NTRs upstream of functional IRES elements.
Such deletions could occur by partial degradation of positive-strand template prior to
negative-strand synthesis, by premature termination during negative-strand RNA synthesis, or
by degradation of 3' terminal negative-strand sequence after synthesis. It is proposed that
5' HCV was forced to recombine with cellular sequences because HCV does not have a
5'-(G/A)UAU sequence upstream of its IRES. The first occurrence of a (G/A)UAU
tetranucleotide sequence is at nt 94-97 within hairpin H2, and a 5' deletion extending into this
sequence would presumably inactivate or severely impair HCV IRES activity. It is interesting
that BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants were generated at much higher
frequency than 5'HCV pseudorevertants. This may indicate that recombination between
BVDV and cellular RNAs is a rare event compared to the processes which lead to deletion of
terminal viral sequences.

Poliovirus chimeras dependent upon a functional HCV IRES have been reported (Lu
& Wimmer, 1996). Interestingly, viable poliovirus chimeras were produced only when HCV
sequences included both the IRES and the N-terminal portion of the HCV ORF. Nucleotide
sequences or structures in the downstream ORF can modulate HCV IRES translational
efficiency (see Reynolds et al., 1995; Honda et al., 1996a) but it was also suggested that the
N-terminal portion of the HCV core polypeptide might be involved. In the case of our 5'
HCV pseudorevertants, there is no requirement for HCV C protein sequences. Although the
translation efficiency of the HCV IRES in the presence of additional HCV sequences 3' to the
AUG start was not directly assessed, the HCV chimeras and pseudorevertants were
translationally active and infectious in the absence of any portion of the HCV ORF. This
indicates that either the HCV IRES does not extend into the HCV ORF or that the BVDV
ORF contains analogous sequence which functions in our 5' HCV chimeras. There is some
limited identity between HCV and BVDV within this region. For example, HCV nt 359-394
and BVDV nt 405-440 are identical at 21 of 36 positions, although identity within this
sequence may be attributed to a high adenosine content. It is interesting to note that the
luciferase (LUC) and chloramphenicol acetyl transferase (CAT) reporter genes previously
used to detect HCV IRES activity (Tsukiyama-Kohara et al., 1992; Wang et al., 1993) also
have adenosine- or purine-rich regions in relatively the same position as the HCV ORF and
BVDV ORF. If this region is indeed important for IRES activity, this may explain why some
have observed that the HCV IRES does not require a portion of the HCV ORF for translation
of CAT or LUC (Tsukiyama-Kohara et al., 1992; Wang et al., 1993). Point mutations and
insertions within this region of HCV have been shown to reduce HCV IRES activity in vitro
(Honda et al., 1996a,b).
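The caveat about adenosine content can be made concrete with a small calculation. The sketch below is illustrative only: the base compositions are hypothetical (the specification does not give them), positions are assumed independent, and the function names are not from the source. It compares the reported 21 of 36 identity with the fraction of matches expected by chance under two compositions.

```python
def count_identical(a: str, b: str) -> int:
    """Count matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b))

def chance_identity(freqs_a: dict, freqs_b: dict) -> float:
    """Expected fraction of matching positions for two random sequences
    with the given base frequencies (positions assumed independent)."""
    return sum(freqs_a[base] * freqs_b[base] for base in "ACGU")

reported = 21 / 36                                       # figure quoted in the text
uniform = {base: 0.25 for base in "ACGU"}
a_rich = {"A": 0.55, "C": 0.15, "G": 0.15, "U": 0.15}    # hypothetical composition

print(f"reported identity:     {reported:.1%}")
print(f"chance, uniform bases: {chance_identity(uniform, uniform):.1%}")
print(f"chance, A-rich bases:  {chance_identity(a_rich, a_rich):.1%}")
```

Under these assumed compositions an adenosine-rich region raises the chance-level identity above the uniform baseline, which is the reason the text treats the 21 of 36 match with caution.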

Despite the fact that B1' and B1 are conserved among different strains of BVDV and
similar hairpins are present in border disease virus and CSFV (Deng & Brock, 1993; Becher
et al., 1998), B1' and B1 were dispensable for BVDV replication, provided that the 5'
tetranucleotide sequence 5'-(G/A)UAU remained. This may indicate a role for B1' and B1 in
viral replication in vivo that we do not observe in cell culture. It will be interesting to test the
phenotype of chimeras that lack B1' and B1 in vivo to determine if they are attenuated and
might serve as useful BVDV vaccines. In this vein, several studies with flaviviruses have
demonstrated that alterations in 5' NTR or 3' NTR elements can lead to attenuation in vivo
(Cahour et al., 1995; Men et al., 1996; Mandl et al., 1998). BVDV chimeras that utilize the
HCV or EMCV IRES may also prove to be attenuated simply due to the presence of the
heterologous IRES. For poliovirus, it has been shown that differences in IRES efficiency in
different host-cell environments can modulate host range and virulence (Shiroki et al., 1997).

BVDV-HCV chimeras that are dependent on a functional HCV IRES may have
another practical application. It may be possible to use these chimeras to screen for anti-HCV
therapeutics that target the HCV IRES. Other researchers have shown antisense
oligonucleotide-mediated inhibition of HCV gene expression in hepatocytes by targeting the
oligonucleotides to the HCV IRES (Hanecak et al., 1996). It will be of interest to measure the
efficacy of antisense oligonucleotides or ribozymes (Lieber et al., 1996) against replicating
virus, and these chimeras are more useful than HCV for this purpose since they are able to
replicate efficiently in cell culture. BVDV is believed to be a reasonable model of HCV
replication not only because of homology and conserved motifs within the 5' NTR but also
because of similarities in overall genetic organization (Rice, 1996) and polyprotein processing
strategy (Tautz et al., 1997; Xu et al., 1997).
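The comparison recited in claim 10 of this specification (a compound is flagged when it reduces replication of the chimera more than it reduces replication of the parental pestivirus) can be sketched as follows. This is one possible reading, not the applicants' protocol: the function names, the use of fractional reduction as the measure, the `margin` parameter, and the example titers are all assumptions made for illustration.

```python
def fractional_reduction(without_compound: float, with_compound: float) -> float:
    """Fraction of replication lost when the test compound is present."""
    if without_compound <= 0:
        raise ValueError("untreated replication level must be positive")
    return 1.0 - with_compound / without_compound

def suggests_anti_hcv_activity(chimera_untreated: float, chimera_treated: float,
                               pestivirus_untreated: float, pestivirus_treated: float,
                               margin: float = 0.0) -> bool:
    """True when the compound reduces chimera replication more than it reduces
    pestivirus replication, by more than `margin` (a hypothetical threshold)."""
    chimera_loss = fractional_reduction(chimera_untreated, chimera_treated)
    pestivirus_loss = fractional_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_loss - pestivirus_loss > margin

# Hypothetical titers (PFU/ml), invented purely to exercise the comparison.
print(suggests_anti_hcv_activity(1e6, 1e4, 1e6, 8e5))   # chimera hit much harder
print(suggests_anti_hcv_activity(1e6, 1e4, 1e6, 1e4))   # both reduced equally
```

Running the parental pestivirus in parallel is what separates an effect on the HCV-derived element from a general effect on BVDV replication or on the host cell.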

In view of the above, it will be seen that the several advantages of the invention are
achieved and other advantageous results attained.

As various changes could be made in the above methods and compositions without
departing from the scope of the invention, it is intended that all matter contained in the above
description and shown in the accompanying drawings shall be interpreted as illustrative and
not in a limiting sense.

All references cited in this specification, including patents and patent applications, are
hereby incorporated by reference. The discussion of references herein is intended merely to
summarize the assertions made by their authors and no admission is made that any reference
constitutes prior art. Applicants reserve the right to challenge the accuracy and pertinency of
the cited references.

What is Claimed is:

1. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric viral RNA is replication-competent.

2. The polynucleotide of claim 1, wherein the chimeric region is the 5' NTR and
the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

3. The polynucleotide of claim 2, wherein the BVDV nucleotide sequence is
located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

4. The polynucleotide of claim 3, wherein the first HCV nucleotide sequence in
the chimeric 5' NTR comprises an internal ribosome entry site (IRES).

5. The polynucleotide of claim 4, wherein the ORF and the 3' NTR consist of
second and third BVDV sequences.

6. The polynucleotide of claim 5, wherein the 5' terminal sequence comprises 5'
GUAU.

7. The polynucleotide of claim 4, wherein the ORF comprises a second HCV
sequence encoding at least one structural protein operably linked to a second BVDV
sequence.

8. The polynucleotide of claim 1, wherein the pestivirus is BVDV and the
chimeric region is the 3' NTR.

9. The polynucleotide of claim 8, wherein the first HCV sequence in the
chimeric 3' NTR comprises the HCV 98 bp 3' terminal element (SEQ ID NO:X) operably
linked to the first BVDV sequence.

10. A method for identifying compounds having antiviral activity against
hepatitis C virus (HCV) comprising the steps of:

(a) providing a first cell containing a chimeric viral RNA which is replication-competent
in the cell, the chimeric viral nucleic acid comprising a 5' nontranslated region (5'
NTR), an open reading frame (ORF) region; and a 3' nontranslated region (3' NTR);
wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV);

(b) providing a second cell containing the pestivirus; and

(c) comparing the replication efficiency of the chimeric viral RNA in the
presence and absence of a test compound to the replication efficiency of the pestivirus in the
presence and absence of the test compound,

wherein a greater reduction in compound-induced replication efficiency of the chimeric viral
RNA than the pestivirus indicates the compound has anti-HCV activity.

11. The method of claim 10, wherein the chimeric region is the 5' NTR and the
first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

12. The method of claim 11, wherein the BVDV nucleotide sequence is located
at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

13. The method of claim 12, wherein the first HCV nucleotide sequence in the
chimeric 5' NTR comprises an internal ribosome entry site (IRES).

14. The method of claim 13, wherein the ORF and the 3' NTR comprise second
and third sequences from the BVDV.

15. The method of claim 10, wherein the pestivirus is BVDV and the chimeric
region is the 3' NTR.

16. A genetically-engineered virus comprising a chimeric RNA genome which
comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric RNA genome is replication-competent.

17. The genetically-engineered virus of claim 16, wherein the chimeric region is
the 5' NTR and the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus
(BVDV).

18. The genetically-engineered virus of claim 16, wherein the BVDV nucleotide
sequence is located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU and
the first HCV nucleotide sequence in the chimeric 5' NTR comprises an internal ribosome
entry site (IRES).



19. A vaccine against bovine viral diarrhea virus (BVDV) comprising an
immunogenically-effective amount of a genetically-engineered virus comprising a chimeric
RNA genome having:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from BVDV in operable linkage with a first nucleotide sequence from an hepatitis C virus
(HCV), and wherein the genetically-engineered virus is attenuated as compared to BVDV.

20. The vaccine of claim 19, wherein the chimeric region is the 5' NTR and the
BVDV nucleotide sequence is located at the 5' terminus of the chimeric 5' NTR and
comprises 5' RUAU and the first HCV nucleotide sequence in the chimeric 5' NTR
comprises an internal ribosome entry site (IRES).



21. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a heterologous nucleotide sequence and wherein
said chimeric viral RNA is replication-competent.
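
By way of illustration only, the comparison recited in step (c) of claim 10 may be sketched as follows. The sketch assumes that replication efficiency has been measured as a single positive number per condition (for example a viral titer or an RNA yield). The function names, the example values, and the use of fractional reduction as the metric are assumptions made for illustration and are not taken from the specification.

```python
def fractional_reduction(untreated: float, treated: float) -> float:
    """Fraction by which the test compound lowers replication (0.0 = no effect)."""
    if untreated <= 0:
        raise ValueError("untreated replication must be positive")
    return (untreated - treated) / untreated


def indicates_anti_hcv_activity(chimera_untreated: float, chimera_treated: float,
                                pestivirus_untreated: float,
                                pestivirus_treated: float) -> bool:
    """True when the compound reduces replication of the chimeric viral RNA
    more than it reduces replication of the parental pestivirus."""
    chimera_drop = fractional_reduction(chimera_untreated, chimera_treated)
    pestivirus_drop = fractional_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_drop > pestivirus_drop


# Hypothetical measurements in arbitrary units (not data from the specification).
result = indicates_anti_hcv_activity(
    chimera_untreated=100.0, chimera_treated=25.0,
    pestivirus_untreated=100.0, pestivirus_treated=90.0,
)
print(result)
```

In this hypothetical case the compound lowers replication of the chimera by 75% and of the pestivirus by 10%, so the sketch reports the compound as a candidate with anti-HCV activity. A real assay would also need replicates and a statistical criterion, which the sketch omits.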
pACNR/BVD NADL-Xba* -> Graphic Map

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to

HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.
pACNR/BVD NADL-Xba* -> Genes

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to

HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.



1 gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctagggaacaaatccctc 80

81 tcagcgaaggccgaaaagaggctagccatgcccttagtaggactagcataatgaggggggtagcaacagtggtgagttcg 160 

161 ttggatggcttaagccctgagtacagggtagtcgtcagtggttcgacgccttggaataaaggtctcgagatgccacgtgg 240 

241 acgagggcatgcccaaagcacatcttaacctgagcgggggtcgcccaggtaaaagcagttttaaccgactgttacgaata 320 

321 cagcctgatagggtgctgcagaggcccactgtattgctactaaaaatctctgctgtacatggcac ATG GAG TTG 394 

1 MEL 3

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454

4 ITNELLYKTYKQKPVGVEEP 23

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514 

24VYDQAGDPLFGERGAVHPQS 43 

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574 

44TLKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 

64PKRGD CRSGNSRGPVSG I YL 83 

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694 

84 K PGPLFYQ DYKG PVYH RA PL 103 

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGC ATA GGG AGA GTA ACT GGA 754

104 ELFEEGSMCETTKRIGRVTG 123

755 ACT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814

124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874

144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934

164 WVTTCSDTKEEGATKKKTQK 183



935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994 

184 PDRLERGKMKIVPKE SEKDS 203 

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054 

204 KTKPPDATIVVEGVKYQVRK 223 

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114 

224 KGKTK SKNTQDG LYHNKN K P 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174 

244 QESRKKLEKALLAWAI X A I V 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 

264 LFQVTMGENITQWNLQDNGT 283 

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294 

284 EG I QR AM FQ RGVNR S L H G I W 303 

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 

304 PEKICTGVPSHLATDI ELKT 323 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414 

324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 

344 HEWNKHGWCNWYN IEPWILV 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 

364 MN.RTQANLTEG Q PPRE C AVT 383 

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 

384 CRYDRASDLNVVTQARDSPT 403

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654 

404 PLTGCKKGKNFSFAGILM RG 423 






1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714

424 PCNFEIAASDVLFKEHERIS 443



1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GCT GCC 1774 

444 MFQDTTLY LVDCLT NSLEGA 463 

1775 AGA CAA GGA ACC GCT AAA CTC ACA ACC TGC TTA GCC AAG CAG CTC GGG ATA CTA GGA AAA 1834 

464 RQGTAKLTTWLGKQLG1LGK 4 83 

1835 AAG TTG GAA AAC AAG AGT AAG ACC TGC TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894 

484 KLENKSKTWFGAYAASPYCD 503 



1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954

504 VDRKIGYIWYTKNCTPACLP 523



1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGK I 543 

2015 TTA CAT GAG ATG GGC GCT CAC TTG TCC GAG GTA CTA CTA CTT TCT TTA GTG CTC CTG TCC 2074 

544 LHEMGGHLSEVLLLSLVVLS 563 



2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134

564 DFAPETASVMYLILHFSIPQ 583



2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCC GTC TCC AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVI PGSVWNLGKYVCIRP 623 

2255 AAT TGC TGC CCT TAT GAG ACA ACT GTA GTC TTG GCA TTT GAA GAG GTG ACC CAG GTG GTG 2314 

624 NWWPYETTVVLAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGC AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA. GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGI L W 683 

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGC CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494 

684 LLLITGVQGH LDCK PEFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554

704 IAKDERIGQLGAEGLTTTWK 723

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614

724 EYSPGMKLEDTMVIAWCEDG 743



2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLA1LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CCA AAG CAA GAG GAT 2734 

764 RALPTSVVFKK LFDGRKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GCA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAKPI 803

2795 GTA AGA GGC AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTL LNG PAFQMVC P 823 

2855 ATA GGA TGG ACA GGG ACT GTA ACC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGC ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GCC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHR0G CIT0 863 

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNC I LGGNWTCV P 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TCT GGC TAT CAA 3094 

884 GDQLLYKGGSIESCKWCGYQ 903

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHY P IGKCKLENE 923 

3155 ACT GGT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTC GCC ATA GTA CCA 3214 

924 TGYRLVDSTSCNREGVA1VP 943 

3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCKICKTTVQVIAMDT 963 

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEI I SSEGPVE 983 

3335 AAG ACA GCC TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKT LKNKYFEPR 1003 
5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194

1584 KITYFALMDGKVYDITEWAG 1603 

5195 TGC CAG CGT GTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254

1604 CQRVGISPDTHRVPCHISFG 1623

5255 TCA CGG ATG CCT TTC AGG CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 5314

1624 SRMPFRQEYNGFVQYTARGQ 1643

5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374

1644 LFLRNLPVLATKVKMLMVGN 1663

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 5434

1664 LGEEIGNLEHLGWILRGPAV 1683

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494

1684 CKKITEHEKCHINILDKLTA 1703

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGI M PRGTT PRAPVRF PTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGGI 1743 

5615 AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CCA 5674 

1744 SSVDHVTAGKDLLVCDSMGR 1763 

5675 ACT AGA GTG CTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNR LTDETEYGVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGCPDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT 5854 
1804 I SGSKGAVVH LQKTGGEFTC 1823 

5855 GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 5914 
1824 VTASGTPAFF DLKNLKGWSG 1843 

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 5974 
1844 LPIFEASSGRVVGRVKVGKN 1863 

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034 
1864 EESKPTKIMSGIQTVSKNRA 1883 

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 6094 
1884 DL TEMVKKITSMNRGDFKQI 1903 

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154 
1904 TLATGAGKTTELPKAVIEEI 1923 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC 6214 
1924 GRHKRVLVLIPLRAAAESVY 1943

6215 CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 
1944 QYMRLKHPSI SFNLR1GDMK 

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 
1964 EGDMATGITYASYGYFCQMP 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 
1984 OPKLRAAMVEYSYIFLDEYH 

6395 TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA 
2004 CATPEQLAIIGKIHRFSESI 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TCC CTG ACC ACA ACA GGT CAA AAG CAC 
2024 RVVAMTATPAGSVTTTGQKH 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC 
2044 piEEFIAPEVMKGEDLGSOF 

6575 CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT 
2064 LDIAGLKIPVDEMKGNMLVF 
8615 AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG ATA ATC TTT GAA GCC TTC GAG TTA TTA GGG 8674
2744 RTAGRNLFTLI MFEAFELLG 2763 

8675 ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC 8734 
2764 MDSQGK I RNLSGNY I LD L I Y 2783 

8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTG GGG TGG GCC CCT GCA 8794 
2784 GLKK Q1NRGLKKMVLGWAPA 2803 

8795 CCC TTT AGT TGT GAC TGG ACC CCT ACT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854 
2804 PFSCDWTPSDERI RLPTDNY 2823 

8855 TTG AGG GTA GAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT 8914 
2824 LRVETRCPCGYEMKAFKNVG 2843 

8915 GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG 8974 
2844 GKLTKVEESGPFLC.RNRPGR 2863 

8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034 
2864 GPVNYRVTKYY DDNLRE I KP 2883 

9035 GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094 
2884 VAK LEGQVEHY YKGVTAK I D 2903 

9095 TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA 9154 
2904 YSKGKMLLATDKWEV EHGV I 2923 

9155 ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG 9214 
2924 TRLAKRYTGVGFNGAYLGDE 2943 

9215 CCC AAT CAC CCT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274 
2944 PNHRALVERDCAT ITKNTVQ 2963 

9275 TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334 
2964 FLKMKKGCAFTYDLTISNLT 2983 

9335 AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG 9394 
2984 RLIELVHRNNL.EEKEIPTAT 3003 

9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454 
3004 VTTWLAYTFVNEDVGTI KPV 3023 

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514 
3024 LGERVIPDPVVDINLQPEVQ 3043 

9515 GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA 9574 
3044 VDTSEVGITI IGRETLMTTG 3063 

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634 
3064 VTPVLEKVEPDASDNQNSVK 3083 

9635 ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694 
3084 IGLDEGNYPGPG I QTHTLTE 3103 

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754 
3104 EIHNRDARPFIMI LGSRNSI 3123 

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814 
3124 SNRAKTARNI NLYTGNDPRE 3143 

9815 ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874 
3144 IRDLMAAGRMLVVALRDVDP 3163 

9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934 
3164 ELSEMVDFKGTFLDREALEA 3183 

9935 CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA 9994 
3184 LSLGQPKPKQVTKEAVRNL1 3203 

9995 GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG 10054 
3204 EOKKDVEI PNWFASDDPVFL 3223 

10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT 10114 
3224 EVALKNDKYY LVGDVGELKD 3243 

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174 
3244 QAKALGATDQTRI IKEVGSR 3263 

10175 ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA 10234 
3264 TYAMKLSSWFLKASNKOMSL 3283 

10235 ACT CCA CTG TTT GAG GAA TTG TTG CTA COG TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG 10294 
3284 TPLFEELLL.RCP PATKSNKG 3303 

10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG 10354 
3304 HMASAYQLAQCNWEPLGCCV 3323 
12095 TGT GTT GCC ATT OGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154
3904 CVAICKEEGNWLVNADRLI S 3923 

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 
3924 SKTGHLYI PDKGFTLQGKHY. 3943 

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC 12274 
3944 EQL.QLRTETNPVMGVGTERY 3963 

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC 12334 
3964 KLGPIVNLLLRRLKI LLMTA 3983 

12335 GTC GGC GTC AGC AGC TGA gacaaaacgtatatactgtaaataaactaatccatgtacacagtgtatataaatat 12408 
3984 V G V S S 3989 

12409 agctgggaccgcccacctcaagaagacgacacgcccaacacgcacagctaaacagtagtcaagattatctacctcaagat 12488 

12489 aacactacacctaacgcacacagcaccccagccgcacgaggacacgcccgacgcccacagccggaccagggaagacccct 12568

12569 aacagccccccgcaggttaattaactagtgggaatacgcggggtatgccgcgtttcagcatattgacgacccaactctca 12648 

12649 cgttcgacagcccaccaccgccgagcaagacgtcccccgtcgaatatggctcacaacacccctcgcaccaccgtttatgc 12728 

12729 aagcagacagcttcactgtccatgacgatacatttttatcccgcgcaacgcaacaccagagatcctgagacacgtggcct 12808 

12809 tgttgaataaatcgaacttttgctgagttgaaggatcagatcacgcaccttcccgacaacgcagaccgttccgtggcaaa 12888 

12889 gcaaaagcccaaaatcaccaactggtccacctacaacaaagctctcaccaaccgtggctccctcactttctggctggacg 12968 

12969 atggggcgattcaggcctggtatgagtcagcaacaccttcttcacgaggcagacctcagcgctagcggagtgtatactgg 13048 

13049 cttactatgttggcactgatgagggtgtcagcgaagcgctccatgtggcaggagaaaaaaggctgcaccggtgcgtcagc 13128

13129 agaatatgcgatacaggatatattccgcttcctcgctcactgactcgctacgctcggtcgttcgaccgcggcgagcggaa 13208

13209 atggctcacgaacggggcggagattccctggaagatgccaggaagatacttaacagggaagtgagagggccgcggcaaag 13288

13289 ccgtttttccacaggctccgcccccccgacaagcatcacgaaatccgacgcccaaaccagtggtggcgaaacccgacagg 13368

13369 accataaagataccaggcgccccccccggcggcccccccgcgcgctcccctgcccccgccccccggcctaccggtgccac 13448

13449 cccgccgtcatggccgcgcccgcctcaccccacgcccgacacccagctccgggcaggcagtccgccccaagccggaccgc 13528

13529 acgcacgaaccccccgcccagcccgaccgctgcgccctatccggtaaccaccgtctcgagcccaacccggaaagacatgc 13608 

13609 aaaagcaccaccggcagcagccaccggtaaccgacttagaggagtcagcctcgaagtcacgcgccggccaaggccaaacc 13688 

13689 gaaaggacaagctttggcgactgcgcccccccaagccagtcaccccggctcaaagagctggtagcccagagaaccctcga 13768 

13769 aaaaccgccccgcaaggcggctcttccgcttccagagcaagagattacgcgcagaccaaaacgacctcaagaagaccatc 13848 

13849 ccatcaaggggtccgacgcccagcggaacgaaaacccacgctaagggaccctggtcatgagaccaccaaaaaggaccctc 13928

13929 acccagatccctttaaactaaaaaLgaagtcctaaaccaatccaaagcacatatgagcaaacccggtctgacagccacca 14008 

14009 acgcctaaccagcgaggcacccatctcagcgacctgcctacctcgttcatccatagctgcctgaccccccgtcgcgcaga 14088 

14089 taaccacgacacgggagggctcaccacctggccccagcgctgcaacgataccgcgagacccacgcccaccggctccagat 14168 

14169 tcatcagcaataaaccagccagccggaagggccgagcgcagaagcggccctgcaactccacccgcccccacccagcctat 14248 

14249 taaccgtcgccgggaagctagagtaagcagctcgccagccaatagtccgcgcaacgtcgccgccattgccgcaggcaccg 14328 

14329 tggcgccacgcccgccgtttggtatggccccattcagctccggtccccaacgatcaaggcgagccacatgaccccccacg 14408

14409 ccgcgcaaaaaagcggccagctccctcggccctccgaccgttgtcagaagcaagccggccgcagcgtcaccacccacggt 14488

14489 catggcagcaccgcataaccccctcaccgtcacgccacccgtaagacgcccttccgcgaccggtgagcacccaaccaagt 14568

14569 caccccgagaacagcgtatgcggcgaccgagctgcccctgcccggcgtcaacacgggacaacaccgcgccacacagcaga 14648

14649 accccaaaagcgcccaccaccggaaaacgccccccggggcgaaaacccccaaggacctcaccgccgtcgagatccagccc 14728

14729 gatgtaacccacccgcgcacccaaccgaccctcagcaccccccacLttcaccagcgcctctgggcgagcaaaaacaggaa 14808 

14809 ggcaaaacgccgcaaaaaagggaacaagggcgacacggaaacgctgaatactcacacccccccttctccaacatcattga 14888 

14889 agcatttiatcagggttatcgtctcatgagcggatacacacccgaacgcacccagaaaaataaacaaacaggggctccgcg 14968 

14969 cacactcccccgaaaagcgccacccgacgccgacccgaggtaactacaacccgggccctacatacggacccaaccccaga 15048 

15049 taacacgactcaccaca 15065 
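
Because each coding row of the listing above pairs 20 codons with a one-letter translation, a reader can check a transcribed row mechanically. The following is a minimal sketch, assuming the standard genetic code and DNA-alphabet codons separated by spaces as printed in the figure. The helper names `translate_row` and `mismatches` are hypothetical and are not part of the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_row(codon_row: str) -> str:
    """Translate space-separated codons; unreadable codons become '?'."""
    return "".join(CODON_TABLE.get(c, "?") for c in codon_row.upper().split())


def mismatches(codon_row: str, printed_translation: str):
    """List (index, printed, computed) where the printed residue disagrees
    with the residue computed from the codon at the same position."""
    computed = translate_row(codon_row)
    printed = printed_translation.replace(" ", "")
    return [(i, p, c) for i, (p, c) in enumerate(zip(printed, computed)) if p != c]


# Row 455-514 of the listing, as printed.
row = ("GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT "
       "GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG")
print(translate_row(row))                        # VYDQAGDPLFGERGAVHPQS
print(mismatches(row, "VYDQAGDPLFGERGAVHPQS"))   # []
```

A non-empty result from `mismatches` flags a position where either the codon or the printed residue was mis-transcribed; the sketch cannot say which of the two is wrong.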
BVDV NADL (inf. clone) -> Genes

DNA sequence 12578 b.p. gtatacgagaat ... ctaacagccccc linear

1 gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctagggaacaaatccctc 80

81 tcagcgaaggccgaaaagaggctagccatgcccttagtaggactagcataatgaggggggtagcaacagtggtgagttcg 160

161 ttggatggcttaagccctgagtacagggtagtcgtcagtggttcgacgccttggaataaaggtctcgagatgccacgtgg 240

241 acgagggcatgcccaaagcacatcttaacctgagcgggggtcgcccaggtaaaagcagttttaaccgactgttacgaata 320



4ITNELLYKTY KQKPVGVEE 

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CA 
24VYDQAGDPLFGERGAVHP.Q 

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TO 
44 T LK L P H K R GE RD V P TN L A S 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TA 
64PKRGDCRSGNSRGPVSG IY 

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CO 
84 K PG P LFY QDYKG PVYHRA P 

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA AC 
104 ELFEEGSMCETTKRIGRVT 

755 ACT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AA 
124 SDGKLYHIYV .CIDGCI I IK 

815 GCC ACG AGA ACT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CC 
144 A T R S Y Q R V F R WV H N R. L D C P 

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CA 
164 WVTTC SDTKEEGATKKKTQ 

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GA 
184 P D R LERGKMKIVPKE SEKD 

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AG 
204 K T K P PDAT IVVEGVKYQVR 

1055 AAG GGA AAA ACC AAG ACT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AA 
224 KGKTKSKNT. QD. GLYHNKNK 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT AT 
244 OESRKKLEKALLAWAI I A I 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GG 
264 LFQVTMG E N I TQWNLQDNG 

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA ACT TTA CAT GGA AT 
284 E G I QRAMFQRGVNRSLHGI 

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AA 
304 PEK ICTGVPSHLATDI ELK 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CA 
324 I HGMMDASEKTNYTCCR L. Q 

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CT 
344 HEWNKHGWCNWYNIEPWI L 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GT 
364 MNRTQANLTEGQPPRECAV 

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CC 
384 CRYDRASDLNVVTQARD SP 

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CG 
404 PLTGCKK GKNFSFAG I LM R 

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC AT 
424 PCKFE I AASDVLFKEH E R I 

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GG 
444 MFODTTL.Y LVDGLTNS L EG 
1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834

464 RQGTAKLTTWLGKQLGILGK 483

1835 AAG TTG GAA AAC AAG AGT AAG ACG TGG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894 

484 KLENKSK .TWFGAYAASPYCD 503 

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954 

504 VDRK I GY IWYTKNCT PACL P 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTK I VG PGK FDTNA £ D G K I 543 

2015 TTA CAT GAG ATG GGG GGT CAC TTG TCC GAG GTA CTA CTA CTT TCT TTA CTC GTC CTG TCC 2074 

544 LHEMGGH LSEV LLLS L V V LS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLILHFSI PO 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTC GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCC GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVI PGSVWNLGKYVCI RP 623 

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTC TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPY ETTVV LAFE E V S Q V V 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGI LW 683 

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCC TAT GCC 2494 

684 LLLITGVQGHLDCKP EFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERI GO LG AEG LTTTWK 723 

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVI AWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTR*ETRY LA I LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFE FG LCPC DAK P I 803 

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTLLNGPAFQ/MVC P 823 

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 
824 ICWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGC ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 
844 VVR TYRRSKPFPHRQGCITQ 863 

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 
864 KNLGEDLHNC I LGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 
884 GDQLLYKGG S I ESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 
904 FKESEGLPHY PIGKCKLENE 923 

3155 ACT GCT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 
924 TCYRLVDSTSCNREGVA I VP 943 

3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 
944 QGTLKCK I G KTTVQV I AMDT 963 

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 
964 KLGPMPCRPYE I I SSEGPVE 983 

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 
984 KTACTFNYTKTLKNKYFE PR 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 
1004 DSYFOOYMLKGEYQYWFDLE 1023 

3455 GTG ACT GAC CAT CAC CGC GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC 3514 
1024 VTDHHRDYFAESI LVVVVAL 1043 
3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC
1044 LGGRYVLWLLVTY 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GGG GAA GTG GTG ATG 
1064 ALG I QYGSGEVVM 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC 
1084 NNIEVVTYFLLLY 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG 
1104 VKKWVLLLYH ILV 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC 
1124 IVI LLMIGDVVKA 

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA 
1144 LGK I DLCFTTVVL 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA 
1164 ARRDPTIVPLVTI 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG 
1184 ELTHQPGVDI AVA 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA 
1204 MVSYVTDYFRYKK 

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC 
1224 LVSAVFLIRSLIY 

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA 
1244 EVTI PNWRPLTLI 

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA 
1264 TIVTRWKVDVAGL 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC 
1284 LLLVTTLWADFLT 
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I 1283 



ACC 4294 
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4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT 
1304 YELVKLYY LKTVRTD 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT 
1324 LGGI DYTRVDSIYDV 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT 
1344 GVYLFPSRQKAQGNF 

4475 CTT ATC AAA GCA ACA CTG ATA ACT TGC GTC AGC AGT AAA TGG CAG 
1364 LIKATLISCVSSKWQ 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA 
1384 YLTLDFMYYMHRKVI 

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG 
1404 GTNI I SRLVAALI EL 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA 
1424 EEESKGLKKFYLLSG 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTG GCT TCT TGG TAC 
1444 IIKHKVRNETVASWY 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG 
1464 YGMPKIMTI IKASTL 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAG TGG AAA GGT GGC 
1484 C I I CTVC E G R E WKGG 

4895 GGA CGC CAT GGG AAG CCG ATA ACG TGT GGG ATG TOG CTA GCA GAT 
1504 GRHGKPI TCGMSLAD 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG GGT ATG TGC 
1524 YKRI F IREGNFEGMC 

5015 AAG CAT AGG AGG TTT GAA ATG GAC CGG GAA CCT AAG AGT GCC AGA 
1544 KHRRFEMDREPKSAR 
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GGA 5014 
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TGT 5074 
C 1563 



3 



5075 AAT AGG CTG CAT CCT GCT GAG GAA GGT GAC TTT TGG GCA GAG TCG AGC ATG TTG GGC CTC 5134
1564 NRLHPAEEGDFWAESSMLGL 1583

5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194
1584 KITYFALMDGKVYDITEWAG 1603

5195 TGC CAG CGT GTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254
1604 CQRVGISPDTHRVPCHISFG 1623
5255 TCA CGG ATG CCT TTC AGG CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 5314

1624 SRMPFROEYNG FVQYTARGQ 1643 

5315 CTA TTT CTG AGA AAC TTC CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374 

1644 LFLRNLPVLAT KVKMLMVGN 1663 

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 5434 

1664 LGEEIGNLEH L G W I LRGPAV 1683 

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494 

1684 CKKITEHEKCH INILDKLTA 1703 

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGIMPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGG I 1743 

5615 AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA 5674 

1744 SSVDHVT AGKDLLVCDSMGR 1763 

5675 ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNR LTDETEYGVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGC PDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT 5854 

1804 I SGSKGAVVH LQKTGGEFTC 1823 

5855 GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 5914 

1824 VTASGTPAFFDLKNLKGWSG 1843 

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 5974 

1844 LPI FEASSGRVVGRVKVGKN 1863 

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034 

1864 EESKPTKIMSGIQTVSKNRA 1883

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 6094

1884 DLTEMVKKITSMNRGDFKQI 1903

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154

1904 TLATGAGKTTELPKAVIEEI 1923

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC 6214

1924 GRHKRVLVLIPLRAAAESVY 1943

6215 CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 6274 

1944 QYMRLKHPSI SFNLRIGDM K 1963 

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 6334 

1964 EGDMATG I TY A S YGY FC Q M P 1983 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 

1984 0 P K LRAAMVE Y S Y I F LDEY H 2003 

6395 TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA 6454 

2004 C ATPEQLAI I GK I HRFS E S I 2023 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TOG GTG ACC ACA ACA GGT CAA AAG CAC 6514 
2024 RVVAMTATPAGSVTTTGQKH 2043 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC 6574 
2044 PI EEFIAPEVMKGEDLGSQF 2063 

6575 CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT 6634 
2064 LDIAGLKI PVDEMKGNMLVF 2083 

6635 GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 6694 
2084 VPTRNMAVEVAKKLKAKGYN 2103 

6695 TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC 6754 
2104 SGYYYSGEDPANLRVVTSQS 2123 

6755 CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC 6814 
2124 PYVIVATNAI ESGVTLPDLD 2143 

6815 ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC 6874 
2144 TVIDTGLKCEKRVRVSSKI P 2163 

6875 TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG 6934
2164 FIVTGLKRMAVTVGEQAQRR 2183

6935 GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG 6994
2184 GRVGRVKPGRYYRSQETATG 2203

... AAA ATG GTA CTC GGG TGG GCC CCT GCA 8794
... KMVLGWAPA 2803

... AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854
... RIRLPTDNY 2823

... GAG ATG AAA GCT TTC AAA AAT GTA GGT 8914
... EMKAFKNVG 2843

... TTC CTA TGT AGA AAC AGA CCT GGT AGG 8974
... FLCRNRPGR 2863

... GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034
... DDNLREIKP 2883

... TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094
... YYKGVTAKID 2903

9095 TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA 9154
2904 YSKGKMLLATDKWEVEHGVI 2923

9155 ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG 9214
2924 TRLAKRYTGVGFNGAYLGDE 2943

9215 CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274
2944 PNHRALVERDCATITKNTVQ 2963

9275 TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334
2964 FLKMKKGCAFTYDLTISNLT 2983

9335 AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG 9394
2984 RLIELVHRNNLEEKEIPTAT 3003

9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454
3004 VTTWLAYTFVNEDVGTIKPV 3023

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514
3024 LGERVIPDPVVDINLQPEVQ 3043

9515 GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA 9574
3044 VDTSEVGITIIGRETLMTTG 3063

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634
3064 VTPVLEKVEPDASDNQNSVK 3083

9635 ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694
3084 IGLDEGNYPGPGIQTHTLTE 3103

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754
3104 EIHNRDARPFIMILGSRNSI 3123

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814
3124 SNRAKTARNINLYTGNDPRE 3143

9815 ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874
3144 IRDLMAAGRMLVVALRDVDP 3163

9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934
3164 ELSEMVDFKGTFLDREALEA 3183

9935 CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA 9994
3184 LSLGQPKPKQVTKEAVRNLI 3203

9995 GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG 10054
3204 EQKKDVEIPNWFASDDPVFL 3223

10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT 10114
3224 EVALKNDKYYLVGDVGELKD 3243

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174
3244 QAKALGATDQTRIIKEVGSR 3263

10175 ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA 10234
3264 TYAMKLSSWFLKASNKQMSL 3283

10235 ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG 10294
3284 TPLFEELLLRCPPATKSNKG 3303

10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG 10354
3304 HMASAYQLAQGNWEPLGCGV 3323

10355 CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG 10414
3324 HLGTIPARRVKIHPYEAYLK 3343

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA 10474
3344 LKDFIEEEEKKPRVKDTVIR 3363

10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 10534

3364 EHNKWILKK I RFQGNLNTKK 3383 

10535 ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC 10594 

3384 MLNPGKL SEQLDREGR K R N I 3403 

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA 10654 

3404 YNHQIGTIMSSAGIRLEKLP 3423

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC 10714

3424 IVRAQTDTKTFHEAIRDKID 3443

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG 10774

3444 KSENRQNPELHNKLLEIFHT 3463

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG 10834

3464 IAQPTLKHTYGEVTWEQLEA 3483

10835 GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG 10894 

3484 GINRKGAAGFLEKKNIGEVL 3503 

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA 10954 

3504 DSEKH LVEQ LVRDLKAG RKI 3523 



10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG 11014

3524 KYYETAIPKNEKRDVSDDWQ 3543



11015 GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 11074 

3544 AG DLVV E K R PRVI QY P E AKT 3563 

11075 AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA 11134 

3564 RLAITKVMYNWVKQQPVVIP 3583

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 11194 

3584 GYEGKTPLFNIFDKVRKEW D 3603 

11195 TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT 11254 

3604 SFNE PVAVS FDTKAWDTQVT 3623 

11255 AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 11314 

3624 SKDLQLIGEIQKYYYKKEWH 3643 

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT 11374 

3644 KFIDT ITDHMTEVPVI T ADG 3663 

11375 GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC 11434

3664 EVYI RNGQRGSGQPDTSAGN 3683 

11435 AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC GGC TTC TGC GAA AGC ACA GGG GTA CCG TAC 11494 

3684 SMLNV LTMMYGFCESTGVPY 3703 



11495 AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC GTC TGT GGG GAT GAT GGC TTC TTA ATA ACT 11554

3704 KSFNRVARIHVCGDDGFLIT 3723



11555 GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC 11614 
3724 EKGLG LKFANKGMQI LHEAG 3743 

11615 AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA 11674 



3744 KPQKITEGEKMKVAYRFEDI 3763



11675 GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG 11734 

3764 EFCSHTPVPVRWS DNTSSHM 3783 

11735 GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA 11794 

3784 AGRDTAVILSKMATRLDSSG 3803



11795 GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC 11854

3804 ERGTTAYEKAVAFSFLLMYS 3823



11855 TGG AAC CCG CTT GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC 11914 

3824 WNPLVRRI CLLVLSQQPETD 3843 

11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA 11974 

3844 PSKHATYYYKGDPIGAYKDV 3863 

11975 ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC 12034 

3864 IGRNLSELKRTGFEKLANLN 3883 

12035 CTA AGC CTG TCC ACG TTG GGG ATC TGG ACT AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC 12094 

3884 LSLSTLGIWTKHTSKRI IQ D 3903 

12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154 

3904 CVAIGKEEGNWLVNADRLIS 3923 

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 

3924 SKTGHLYI PDKGFTLQGKHY 3943 

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGC ACT GAG AGA TAC 12274
3944 EQLQLRTETNPVMGVGTERY 3963

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983

12335 GTC GGC GTC AGC AGC TGA gacaaaacgtatacaccgtaaacaaattaatccacgtacatagtgcatacaaatat 12408
3984 V G V S S * 3989 

12409 agccgggaccgcccaccccaagaagacgacacgcccaacacgcacagccaaacagtagtcaagaccatctaccccaagat 12488 

12489 aacactacacccaatgcacacagcacctcagccgtacgaggacacgcccgacgtccatagccggaccagggaagaccccc 12568 

12569 aacagccccc 12578 



BVDV NADL cIns- (inf. clone) -> Genes
DNA sequence 12308 b.p. gtatacgagaat ... ctaacagccccc linear



1 gtatacgagaattagaaaaggcactcgtatacgtattgggcaatcaaaaataataattaggcctagggaacaaatccctc 80

81 tcagcgaaggccgaaaagaggctagccacgccctnagtaggaccagcataacgaggggggcagcaacagtggtgagttcg 160

161 ttggatggcttaagccctgagtacagggtagtcgtcagtggttcgacgccttggaataaaggtctcgagatgccacgtgg 240

241 acgagggcacgcccaaagcacatcctaacctgagcgggggtcgcccaggtaaaagcagttttaaccgactgttacgaata 320

321 cagcctgatagggtgctgcagaggcccactgtattgctactaaaaatctctgctgtacatggcac ATG GAG TTG 394
1 MEL 3

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454
4 ITNELLYKTYKQKPVGVEEP 23

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514
24 VYDQAGDPLFGERGAVHPQS 43

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574
44 TLKLPHKRGERDVPTNLASL 63

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634
64 PKRGDCRSGNSRGPVSGIYL 83

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694
84 KPGPLFYQDYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754
104 ELFEEGSMCETTKRIGRVTG 123

755 ACT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814
124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874
144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174
244 QESRKKLEKALLAWAIIAIV 263

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234
264 LFQVTMGENITQWNLQDNGT 283

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294
284 EGIQRAMFQRGVNRSLHGIW 303

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354
304 PEKICTGVPSHLATDIELKT 323

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414
324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474
344 HEWNKHGWCNWYNIEPWILV 363

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534
364 MNRTQANLTEGQPPRECAVT 383

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594
384 CRYDRASDLNVVTQARDSPT 403

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654
404 PLTGCKKGKNFSFAGILMRG 423

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714
424 PCNFEIAASDVLFKEHERIS 443

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774
444 MFQDTTLYLVDGLTNSLEGA 463

1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GCC AAC CAG CTC GGG ATA CTA GGA AAA 1834
464 RQGTAKLTTWLGKQLGI LGK 483 



1835 AAG TTG GAA AAC AAG AGT AAG ACG TGG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894 

484 KLE NKSK TWFGAYAASPYCD 503 

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TCC TTA CCC 1954 

504 VDRK I GYIWYTKNCT PACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTK I VGPGK FDTNA EDGK I 543 

2015 TTA CAT GAG ATG GGG GCT CAC TTG TCG GAG CTA CTA CTA CTT TCT TTA GTC CTG CTG TCC 2074

544 LHEMGGHLSEVL L LS LVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT ACT CTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 D F A PETASVMY LI LH F S I PQ 583 

2135 ACT CAC CTT GAT CTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTC GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603

2195 ACA GCT GAA CTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT CTA TGT ATA AGA CCA 2254

604 TAEVIPGSVWNLGKYVCIRP 623

2255 AAT TGG TGG CCT TAT GAG ACA ACT CTA CTG TTG GCA TTT GAA GAG GTG AGC CAG CTG GTC 2314

624 N W W PY ETTVV LAF EE VSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA CTA TGC CTT CTT AAG ATA GTC AGG GGC CAG ATG CTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGILW 683 

2435 CTA CTA TTG ATA ACA GGG CTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494

684 LLLITGVQGHLDCKPEFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GCT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERIGQLGAEGLTTTWK 723 

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYS PGMKLEDTMVI AWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMY LQRCTRETRYL AI LHT 763 

2675 AGA GCC TTG CCG ACC ACT GTG CTA TTC AAA AAA CTC TTT GAT GGG CCA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDG RKQED 783 



2735 CTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 
784 VVEMNDNFEFGLCPCDAKPI 803

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG CTA TGC CCC 2854

804 VRGKFNTTLLNGPAFQMVCP 823 

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDL HNC I LGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGS I ESCKWCGYQ 903 

3095 TTT AAA GAG ACT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHY PIGKCKLENE 923 

3155 ACT GCT TAC AGG CTA GTA GAC ACT ACC TCT TGC AAT AGA GAA GCT GTG GCC ATA GTA CCA 3214 

924 T G Y R LVDSTSCN R EG VA I V P 943 

3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCKIGKTTVQVIAMDT 963

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA ACT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEI ISSEGPVE 983 

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394

984 KTACTFNYTKTLKNK YFEPR 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 
1004 DSYFQQYMLKGEYQYWFDLE 1023

3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC 3514 
1024 VTDHHRDYFAESI LVVVVAL 1043 

3515 TTG GGT GGC AGA TAT GTA CTT TGC TTA CTG GTT ACA TAC ATC GTC TTA TCA GAA CAG AAG 3574

1044 LGGRYVLWLLVTYMVLSEQK 1063

3575 GCC TTA GGG ATT CAG TAT GGA TCA GGG GAA GTG GTG ATG ATG GGC AAC TTG CTA ACC CAT 3634 

1064 ALG IQYGSGEVVMMGNLLTH 1083 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694 

1084 N N I EVVTYFLLLYLLLREES 1103 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VK KWV L L LYH I LVVH P I KSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814

1124 IVI LLMIGDVVKADSGGQEY 1143 

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874 

1144 LGK IDLC FTTVVLIVI G L I I 1163 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 3934 

1164 ARRDPTI VPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQPGVDIAVAVMTITLL 1203 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 4054 

1204 MVSYVTDYFRYKKWLQC I LS 1223 

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLIRSLIYLGRIEMP 1243

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 

1244 E V T I PNWRPLTLILLYL I ST 1263 

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCVP I 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILI LPT 1303 

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA ACT TGG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414 

1324 LGGIDYTRVDSIYDVDESGE 1343

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 4474

1344 GVYLFPSRQKAQGNFSILLP 1363

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TGG CAG CTA ATA TAC ATG AGT 4534

1364 LIKATLISCVSSKWQLIYMS 1383

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLDFMYYMHRKVIEEISG 1403

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNIISRLVAALIELNWSME 1423 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 4774 

1444 I I KHKVRNETVASWYGEEEV 1463 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 4834 

1464 YGMPKIMTI IKASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAG TGG AAA GGT GGC ACC TGC CCA AAA TGT 4894 

1484 CI I CTVC EGREWKGGTC PKC 1503 

4895 GGA CGC CAT GGG AAG CCG ATA ACG TGT GGG ATG TCG CTA GCA GAT TTT GAA GAA AGA CAC 4954

1504 GRHGKPITCGMSLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG gggccc TTC AGG CAG GAA TAC AAT 5014 

1524 YKR IFIREGNFE FRQEYN 1541 

5015 GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA 5074 

1542 GFVQYTARGQLFLRNLPVLA 1561 

5075 ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT 5134 

1562 TKVKMLMVGNLGEEIGNLEH 1581 

5135 CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC 5194 

1582 LGWILRG PAVCKKITEH EKC 1601 

5195 CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA 5254 

1602 HINILDKLTAFFGIMPRGTT 1621 

5255 CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT 5314
1622 PRAPVRFPTSLLKVRRGLET 1641



5315 GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA 5374
1642 AWAYTHQGGISSVDHVTAGK 1661

5375 GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG 5434
1662 DLLVCDSMGRTRVVCQSNNR 1681

5435 TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA 5494
1682 LTDETEYGVKTDSGCPDGAR 1701

5495 TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC 5554 
1702 CYVLNPEAVNISGSKGAVVH 1721 

5555 CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC 5614 
1722 LQKTGGEFTCVTASGTPAFF 1741 

5615 GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG 5674
1742 DLKNLKGWSGLPIFEASSGR 1761

5675 GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT 5734
1762 VVGRVKVGKNEESKPTKIMS 1781 

5735 GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC 5794 
1782 GIQTVSKNRADLTEMVKKIT 1801 

5795 AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA 5854 
1802 SMNRGDFKQ ITLATGAGKTT 1821 

5855 GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA 5914 
1822 ELPKAVIEEIGRHKRVLVLI 1841 

5915 CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC 5974 
1842 PLRAAAESVYQYMRLKHPSI 1861 

5975 TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT 6034 
1862 SFNLRIGDMKEGDMATGITY 1881 

6035 GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA 6094 
1882 ASYGYFCQMPQPKLRAAMVE 1901 

6095 TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC 6154 
1902 YSY I FLDEYHCATPEQLAI I 1921 

6155 GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA 6214
1922 GKIHRFSESIRVVAMTATPA 1941

6215 GGG TCG GTG ACC ACA ACA GGT CAA AAG CAC CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA 6274 
1942 GSVTTTGQKHPIEEF IAPEV 1961 

6275 ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG 6334
1962 MKGEDLGSQFLDIAGLKIPV 1981 

6335 GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA 6394 
1982 DEMKGNMLVFV PTRNMAVEV 2001 

6395 GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA 6454
2002 AKKLKAKGYNSGYYYSGEDP 2021 

6455 GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT 6514 
2022 ANLRVVTSQSPYVIVATNAI 2041 

6515 GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA 6574 
2042 ESGVTLPDLDTVI DTGLKCE 2061 

6575 AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC 6634 
2062 KRVRVSSKI PFIVTG LKRMA 2081 

6635 GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG 6694 
2082 VTVGEQAQRRGRVGRVKPGR 2101 

6695 TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG 6754 
2102 YYRSQETATGSKDYHYDLLQ 2121

6755 GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT 6814 
2122 AQRYGI EDGINVTKSFREMN 2141 

6815 TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT 6874 
2142 YDWSLYEEDSLLITQLEI LN 2161 

6875 AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT 6934
2162 NLLISEDLPAAVKNIMARTD 2181

6935 CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC AGC TAT GAA GTC CAC GTC CCG GTC CTG TTC 6994
2182 HPEPIQLAYNSYEVQVPVLF 2201

6995 CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC 7054
2202 PKIRNGEVTDTYENYSFLNA 2221

7055 AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA 7114
2222 RKLGEDVPVYIYATEDEDLA 2241

7115 GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT 7174
2242 VDLLGLDWPDPGNQQVVETG 2261

7175 AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA 7234
2262 KALKQVTGLSSAENALLVAL 2281

7235 TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA 7294
2282 FGYVGYQALSKRHVPMITDI 2301

7295 TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC 7354
2302 YTIEDQRLEDTTHLQYAPNA 2321

7355 ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA 7414
2322 IKTDGTETELKELASGDVEK 2341

7415 ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA 7474
2342 IMGAISDYAAGGLEFVKSQA 2361

7475 GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC 7534
2362 EKIKTAPLFKENAEAAKGYV 2381

7535 CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG 7594
2382 QKFIDSLIENKEEIIRYGLW 2401

7595 GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT 7654
2402 GTHTALYKSIAARLGHETAF 2421

7655 GCC ACA CTA GTG TTA AAG TGG CTA GCT TTT GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG 7714
2422 ATLVLKWLAFGGESVSDHVK 2441

7715 CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC 7774
2442 QAAVDLVVYYVMNKPSFPGD 2461

7775 TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA 7834
2462 SETQQEGRRFVASLFISALA 2481

7835 ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG 7894
2482 TYTYKTWNYHNLSKVVEPAL 2501

7895 GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC 7954
2502 AYLPYATSALKMFTPTRLES 2521

7955 GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA ACA TAC CTC TCT ATA AGG AAG GGG AAG AGT 8014
2522 VVILSTTIYKTYLSIRKGKS 2541

8015 GAT GGA TTG CTG GGT ACG GGG ATA AGT GCA GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA 8074
2542 DGLLGTGISAAMEILSQNPV 2561

8075 TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA GGG GCA ATC GCT GCG CAC AAC GCT ATT GAG 8134
2562 SVGISVMLGVGAIAAHNAIE 2581

8135 TCC AGT GAA CAG AAA AGG ACC CTA CTT ATG AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG 8194
2582 SSEQKRTLLMKVFVKNFLDQ 2601

8195 GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA 8254
2602 AATDELVKENPEKIIMALFE 2621

8255 GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC 8314
2622 AVQTIGNPLRLIYHLYGVYY 2641

8315 AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG 8374
2642 KGWEAKELSERTAGRNLFTL 2661

8375 ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG 8434
2662 IMFEAFELLGMDSQGKIRNL 2681

8435 TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG 8494
2682 SGNYILDLIYGLHKQINRGL 2701

8495 AAG AAA ATG GTA CTG GGG TGG GCC CCT GCA CCC TTT AGT TGT GAC TGG ACC CCT AGT GAC 8554
2702 KKMVLGWAPAPFSCDWTPSD 2721

8555 GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT TTG AGG GTA GAA ACC AGG TGC CCA TGT GGC 8614
2722 ERIRLPTDNYLRVETRCPCG 2741

8615 TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG 8674
2742 YEMKAFKNVGGKLTKVEESG 2761

8675 CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT 8734
2762 PFLCRNRPGRGPVNYRVTKY 2781

8735 TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA GTA CCA AAG TTG GAA GCA CAG GTA GAG CAC 8794
2782 YDDNLREIKPVAKLEGQVEH 2801

8795 TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT 8854
2802 YYKGVTAKIDYSKGKMLLAT 2821

8855 GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC 8914
2822 DKWEVEHGVITRLAKRYTGV 2841

8915 GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC 8974
2842 GFNGAYLGDEPNHRALVERD 2861

8975 TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC 9034
2862 CATITKNTVQFLKMKKGCAF 2881

9035 ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT 9094
2882 TYDLTISNLTRLIELVHRNN 2901

9095 CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG 9154
2902 LEEKEIPTATVTTWLAYTFV 2921

9155 AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA 9214
2922 NEDVGTIKPVLGERVIPDPV 2941

9215 GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA 9274
2942 VDINLQPEVQVDTSEVGITI 2961

9275 ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT 9334
2962 IGRETLMTTGVTPVLEKVEP 2981

9335 GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG 9394
2982 DASDNQNSVKIGLDEGNYPG 3001

9395 CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC 9454
3002 PGIQTHTLTEEIHNRDARPF 3021

9455 ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA 9514
3022 IMILGSRNSISNRAKTARNI 3041



9515 AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG 9574 

3042 NLYTGNDPREIRDLMAAGRM 3061

9575 TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG 9634

3062 LVVALRDVDPELSEMVDFKG 3081

9635 ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG 9694

3082 TFLDREALEALSLGQPKPKQ 3101

9695 GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC 9754

3102 VTKEAVRNLIEQKKDVEIPN 3121

9755 TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC 9814

3122 WFASDDPVFLEVALKNDKYY 3141

9815 TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG 9874

3142 LVGDVGELKDQAKALGATDQ 3161

9875 ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC 9934

3162 TRIIKEVGSRTYAMKLSSWF 3181

9935 CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG 9994

3182 LKASNKQMSLTPLFEELLLR 3201

9995 TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG 10054

3202 CPPATKSNKGHMASAYQLAQ 3221

10055 GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG 10114

3222 GNWEPLGCGVHLGTIPARRV 3241

10115 AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG 10174

3242 KIHPYEAYLKLKDFIEEEEK 3261

10175 AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA 10234 



3262 KPRVKDTVIREHNKWILKKI 3281



10235 AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG 10294 

3282 RFQGNLNTKKMLNPGKLSEQ 3301

10295 TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA 10354

3302 LDREGRKRNIYNHQIGTIMS 3321

10355 AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC 10414

3322 SAGIRLEKLPIVRAQTDTKT 3341



10415 TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG 10474

3342 FHEAIRDKIDKSENRQNPEL 3361

10475 CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC 10534

3362 HNKLLEIFHTIAQPTLKHTY 3381



10535 GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC 10594 

3382 GEVTWEQLEAGINRKGAAGF 3401

10595 CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG 10654

3402 LEKKNIGEVLDSEKHLVEQL 3421

10655 GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT 10714

3422 VRDLKAGRKIKYYETAIPKN 3441



10715 GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA 10774

3442 EKRDVSDDWQAGDLVVEKRP 3461

10775 AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC 10834

3462 RVIQYPEAKTRLAITKVMYN 3481

10835 TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC 10894

3482 WVKQQPVVIPGYEGKTPLFN 3501

10895 ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT 10954

3502 IFDKVRKEWDSFNEPVAVSF 3521



10955 GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC 11014 



3522 DTKAWDTQVTSKDLQLIGEI 3541

11015 CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG 11074

3542 QKYYYKKEWHKFIDTITDHM 3561

11075 ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG 11134

3562 TEVPVITADGEVYIRNGQRG 3581



11135 AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC 11194

3582 SGQPDTSAGNSMLNVLTMMY 3601

11195 GGC TTC TGC GAA AGC ACA GGG GTA CCG TAC AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC 11254

3602 GFCESTGVPYKSFNRVARIH 3621

11255 GTC TGT GGG GAT GAT GGC TTC TTA ATA ACT GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC 11314 

3622 VCGDDGFLITEKGLGLKFAN 3641 

11315 AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG 11374 

3642 KGMQILHEAGKPQKITEGEK 3661 

11375 ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT 11434 



3662 MKVAYRFEDIEFCSHTPVPV 3681

11435 AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA 11494

3682 RWSDNTSSHMAGRDTAVILS 3701

11495 AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG 11554

3702 KMATRLDSSGERGTTAYEKA 3721



11555 GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC TGG AAC CCG CTT GTT AGG AGG ATT TGC CTG 11614 

3722 VAFSFLLMYSWNPLVRRICL 3741

11615 TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA 11674

3742 LVLSQQPETDPSKHATYYYK 3761

11675 GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA 11734

3762 GDPIGAYKDVIGRNLSELKR 3781

11735 ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC CTA AGC CTG TCC ACG TTG GGG ATC TGG ACT 11794

3782 TGFEKLANLNLSLSTLGIWT 3801

11795 AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC 11854

3802 KHTSKRIIQDCVAIGKEEGN 3821

11855 TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT 11914

3822 WLVNADRLISSKTGHLYIPD 3841

11915 AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC 11974

3842 KGFTLQGKHYEQLQLRTETN 3861

11975 CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG 12034

3862 PVMGVGTERYKLGPIVNLLL 3881

12035 AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC GTC GGC GTC AGC AGC TGA gacaaaatgtatatat 12098

3882 RRLKILLMTAVGVSS- 3897

12099 tgtaaataaattaatccatgtacatagtgtatataaatatagttgggaccgtccaccccaagaagacgacacgcccaaca 12178 

12179 cgcacagccaaacagtagccaagactacccaccccaagataacaccacacccaacgcacacagcaccccagctgcacgag 12258 

12259 gacacgcccgacgcccacagtcggaccagggaagacccccaacagccccc 12308 



GTATaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccag
gaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggata 
aacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgc
ctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 13 






GTaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggac 
cccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaac 
ccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctg
atagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 14 
GTATacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcg
tgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccggg 
tcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggc
cttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 15 
GTATCAGAAGTGCGAATGCTGAacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaa

gcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtg 

agtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactg 

ctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtg

caccATG 



FIGURE 16 
GTATgccagccccctgatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctag

ccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacacc

ggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccga

gtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccAT

G 



FIGURE 17 






GTATTGCAGTTTgccagccccctgatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgc

agaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaac 

cggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaa 

gactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtaga

ccgtgcaccATG 



FIGURE 18 
GTATTGCAGTTTgccagccccctgatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgc

agaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaac 

cggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaa

gactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtaga

ccgtgcaccATGGAGTTGATCACAAATGAACTTTTATACAAAACATACAAACAAAAAC

CCGTCGGGGTGGAGGAACCTGTTTATGATCAGGCAGGTGATCCCTTATTTGGT 

GAAAGGGGAGCAGTCCACCCTCAATCGACGCTAAAGCTCCCACACAAGAGAG 

GGGAACGCGATGTTCCAACCAACTTGGCATCCTTACCAAAAAGAGGTGACTGC 

AGGTCGGGTAATAGCAGAGGACCTGTGAGCGGGATCTACCTGAAGCCAGGGC 

CACTATTTTACCAGGACTATAAAGGTCCCGTCTATCACAGGGCCCCGCTGGAGC 

TCTTTGAGGAGGGATCCATGTGTGAAACGACTAAACGGATAGGGAGAGTAACT 

GGAAGTGACGGAAAGCTGTACCACATTTATGTGTGTATAGATGGATGTATAATA 

ATAAAAAGTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCCATAATAG 

GCTTGACTGCCCTCTATGGGTCACAACTTGCTCAGACACGAAAGAAGAGGGAG 

CAACAAAAAAGAAAACACAGAAACCCGACAGACTAGAAAGGGGGAAAATGAA

AATAGTGCCCAAAGAATCTGAAAAAGACAGCAAAACTAAACCTCCGGATGCTA

CAATAGTGGTGGAAGGAGTCAAATACCAGGTGAGGAAGAAGGGAAAAACCAA

GAGTAAAAACACTCAGGACGGCTTGTACCATAACAAAAACAAACCTCAGGAAT

CACGCAAGAAACTGGAAAAAGCATTGTTGGCGTGGGCAATAATAGCTATAGTT

TTGTTTCAAGTTACAATGGGAGAAAACATAACACAGTGGAACCTACAAGATAAT 

GGGACGGAAGGGATACAACGGGCAATGTTCCAAAGGGGTGTGAATAGAAGTT

TACATGGAATCTGGCCAGAGAAAATCTGTACTGGTGTCCCTTCCCATCTAGCCA

CCGATATAGAACTAAAAACAATTCATGGTATGATGGATGCAAGTGAGAAGACC

AACTACACGTGTTGCAGACTTCAACGCCATGAGTGGAACAAGCATGGTTGGTG

CAACTGGTACAATATTGAACCCTGGATTCTAGTCATGAATAGAACCCAAGCCAA

TCTCACTGAGGGACAACCACCAAGGGAGTGCGCAGTCACTTGTAGGTATGATA

GGGCTAGTGACTTAAACGTGGTAACACAAGCTAGAGATAGCCCCACACCCTTA

ACAGGTTGCAAGAAAGGAAAGAACTTCTCCTTTGCAGGCATATTGATGCGGGG 

CCCCTGCAACTTTGAAATAGCTGCAAGTGATGTATTATTCAAAGAACATGAACG 

CATTAGTATGTTCCAGGATACTACTCTTTACCTTGTTGACGGGTTGACCAACTCC

TTAGAAGGTGCCAGACAAGGAACCGCTAAACTGACAACCTGGTTAGGCAAGCA 

GCTCGGGATACTACK3AAAAAAGTTGGAAAACAAGAGTAAGACGTGGTTTGGAG 

CATACGCTGCTTCCCCTTACTGTGATGTCGATCGCAAAATTGGCTACATATGGT 

ATACAAAAAATTGCACCCCTGCCTGCTTACCCAAGAACACAAAAATTGTCGGCC 

CTGGGAAATTTGACACCAATGCAGAGGACGGCAAGATATTACATGAGATGGGG 

GGTCACTTGTCGGAGGTACTACTACTTTCTTTAGTGGTGGTGTCCGACTTCGCA 

CCGGAAACAGCTAGTGTAATGTACCTAATCCTACATTTTTCCATCCCACAAAGTC 

ACGTTGATGTAATGGATTGTGATAAGACCCAGTTGAACCTCACAGTGGAGCTG 

ACAACAGCTGAAGTAATACCAGGGTCGGTCTGGAATCTAGGCAAATATGTATG 

TATAAGACCAAATTGGTGGCCTTATGAGACAACTGTAGTGTTGGCATTTGAAGA 

GGTGAGCCAGGTGGTGAAGTTAGTGTTGAGGGCACTCAGAGATTTAACACGCA 

TTTGGAACGCTGCAACAACTACTGCTTTTTTAGTATGCCTTGTTAAGATAGTCAG

GGGCCAGATGGTACAGGGCATTCTGTGGCTACTATTGATAACAGGGGTACAAG 

GGCACTTGGATTGCAAACCTGAATTCTCGTATGCCATAGCAAAGGACGAAAGA 

ATTGGTCAACTGGGGGCTGAAGGCCTTACCACCACTTGGAAGGAATACTCACC 

TGGAATGAAGCTGGAAGACACAATGGTCATTGCTTGGTGCGAAGATGGGAAGT 

TAATGTACCTCCAAAGATGCACGAGAGAAACCAGGTATCTCGCAATCTTGCATA

CAAGAGCCTTGCCGACCAGTGTGGTATTCAAAAAACTCTTTGATGGGCGAAAG 
CAAGAGGATGTAGTCGAAATGAACGACAACTTTGAATTTGGACTCTGCCCATGT

GATGCCAAACCCATAGTAAGAGGGAAGTTCAATACAACGCTGCTGAACGGACC

GGCCTTCCAGATGGTATGCCCCATAGGATGGACAGGGACTGTAAGCTGTACGT 

CATTCAATATGGACACCTTAGCCACAACTGTGGTACGGACATATAGAAGGTCTA 

AACCATTCCCTCATAGGCAAGGCTGTATCACCCAAAAGAATCTGGGGGAGGAT 

CTCCATAACTGCATCCTTGGAGGAAATTGGACTTGTGTGCCTGGAGACCAACTA 

CTATACAAAGGGGGCTCTATTGAATCTTGCAAGTGGTGTGGCTATCAATTTAAA 

GAGAGTGAGGGACTACCACACTACCCCATTGGCAAGTGTAAATTGGAGAACGA 

GACTGGTTACAGGCTAGTAGACAGTACCTCTTGCAATAGAGAAGGTGTGGCCA 

TAGTACCACAAGGGACATTAAAGTGCAAGATAGGAAAAACAACTGTACAGGTC 

ATAGCTATGGATACCAAACTCGGACCTATGCCTTGCAGACCATATGAAATCATA 

TCAAGTGAGGGGCCTGTAGAAAAGACAGCGTGTACTTTCAACTACACTAAGAC

ATTAAAAAATAAGTATTTTGAGCCCAGAGACAGCTACTTTCAGCAATACATGCT 

AAAAGGAGAGTATCAATACTGGTTTGACCTGGAGGTGACTGACCATCACCGGG 

ATTACTTCGCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTGGGTGGCAGAT

ATGTACTTTGGTTACTGGTTACATACATGGTCTTATCAGAACAGAAGGCCTTAG 

GGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCTAACCCAT 

AACAATATTGAAGTGGTGACATACTTCTTGCTGCTGTACCTACTGCTGAGGGAG 

GAGAGCGTAAAGAAGTGGGTCTTACTCTTATACCACATCTTAGTGGTACACCCA

ATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTAAAGGCCGAT 

TCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTTTTACAACAGTAGT 

ACTAATCGTCATAGGTTTAATCATAGCCAGGCGTGACCCAACTATAGTGCCACT 

GGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACCCACCAGCCTGGAG 

TTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTAGCTATGTGA 

CAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTCAGCCTGGTATCTGC 

GGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATCGAGATGCCAGAGG

TAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTATATTTGATCTCAAC

AACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTATTGTTGCAATGTG

TGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCTTAACCCTAATACT

GATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAAAACTGTTAGGACT

GATATAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAAGAGTTGACTCCAT

CTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTCCATCAAGGCAGA

AAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAGCAACACTGATAA 

GTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTTAACTTTGGACT

TTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGGTACCAACA 

TAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCATGGAAGAA

GAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGGAAGGTTGAGAAA 

CCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTTCTTGGTACGGGG 

AGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGCCAGTACA 

CTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCGAGAGTG 

GAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAGCCGATAACGTGT

GGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAGAATCTTTATAAGG 

GAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATAGGAGGT 

TTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCTGAGTGTAATAGG

CTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAGCATGTTGGGCCT 

CAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCACAGAGTG 

GGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACAGAGTCCCTTGTC

ACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACAATGGCTTTGTAC 

AATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCGTACTGGCAACTA 

AAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATTGGTAATCTGGAA

CATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACAGAGCA

CGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATTTTTCGGGATCAT 

GCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTACGAGCTTACTAA 

AAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACACCAAGGCGGGAT 
AAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGACAGCA

TGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGACCGATGAGACA 

GAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGCCAGATGTTATGT 

GTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGTTCACC 

TCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACACCGGCT 

TTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCTATATTTGAAGCC 

TCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGAATGAAGAGTCTA 

AACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAAACAGAGCAGAC

CTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGGAGACTTCAAGCA 

GATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCCCAAAAGCAGTTA 

TAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATACCATTAAGGGCA

GCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTCTTTT 

AACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATAACCT 

ATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGCTCAGAGCTGCTA

TGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACTCCTGAACA 

ACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTATAAGGGTTGTCG 

CCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGTCAAAAGCACCCA 

ATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGGATCTTGGTAGTCA 

GTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGGCAATAT 

GTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTAGCAAAGAAGCTAA

AAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGAGGATCCAGCCAAT 

CTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGCTACAAATGCTATT 

GAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGACACGGGGTTGAA

ATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCATCGTAACAGGCC

TTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGGCAGAGT

AGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAAACAGCAACAGGG

TCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATACGGGATTGAGGA

tggaatcaacgtgacgaaatcctttagggagatgaattacgattggagcctata

cgaggaggacagcctactaataacccagctggaaatactaaataatctactcat

ctcagaagacttgccagccgctgttaagaacataatggccaggactgatcacc

cagagccaatccaacttgcatacaacagctatgaagtccaggtcccggtcctgt

tcccaaaaataaggaatggagaagtcacagacacctacgaaaattactcgtttc

taaatgccagaaagttaggggaggatgtgcccgtgtatatctacgctactgaa

gatgaggatctggcagttgacctcttagggctagactggcctgatcctgggaa 

ccagcaggtagtggagactggtaaagcactgaagcaagtgaccgggttgtcct 

cggctgaaaatgccctactagtggctttatttgggtatgtgggttaccaggctc 

tctcaaagaggcatgtcccaatgataacagacatatataccatcgaggaccaga

gactagaagacaccacccacctccagtatgcacccaacgccataaaaaccgat 

gggacagagactgaactgaaagaactggcgtcgggtgacgtggaaaaaatca

tgggagccatttcagattatgcagctgggggactggagtttgttaaatcccaa 

gcagaaaagataaaaacagctcctttgtttaaagaaaacgcagaagccgcaaa 

agggtatgtccaaaaattcattgactcattaattgaaaataaagaagaaataat

cagatatggtttgtggggaacacacacagcactatacaaaagcatagctgcaa

gactggggcatgaaacagcgtttgccacactagtgttaaagtggctagctttt

ggaggggaatcagtgtcagaccacgtcaagcaggcggcagttgatttagtgg 

tctattatgtgatgaataagccttccttcccaggtgactccgagacacagcaag 

aagggaggcgattcgtcgcaagcctgttcatctccgcactggcaacctacaca 

tacaaaacttggaattaccacaatctctctaaagtggtggaaccagccctggct 

tacctcccctatgctaccagcgcattaaaaatgttcaccccaacgcggctggag 

agcgtggtgatactgagcaccacgatatataaaacatacctctctataaggaag 

gggaagagtgatggattgctgggtacggggataagtgcagccatggaaatcc 

tgtcacaaaacccagtatcggtaggtatatctgtgatgttgggggtaggggca 

atcgctgcgcacaacgctattgagtccagtgaacagaaaaggaccctacttat 

gaaggtgtttgtaaagaacttcttggatcaggctgcaacagatgagctggtaa 
AAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCAGTCCAGACAATTG

GTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTACTACAAAGGTTGGG 

AGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTGATA 

ATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAGGGAAAATAAGGAA

CCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTACACAAGCAAATCAA 

CAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGTTGTG 

ACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACTATTTGAGG 

GTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCTTTCAAAAATGTAGG 

TGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATGTAGAAACAGAC

CTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAACCTC 

AGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACTACTACAA 

AGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAATGCTCTTGGCCACTG 

ACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCTAAGAGATATACT 

GGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCCAATCACCGTGCTCT 

AGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGTACAGTTTCTAAAAAT 

GAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTCCAATCTGACCAGGC 

TCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGGAAATACCCACCGCT 

ACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAGACGTAGGGACTAT 

AAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTAGTTGATATCAATTT 

ACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGATCACAATAATTGGAA 

GGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGGAAAAAGTAGAGCCT 

GACGCCAGCGACAAGCAAAACTCGGTGAAGATCGGGTTGGATGAGGGTAATTA 

CCCAGGGCCTGGAATACAGACACATACACTAACAGAAGAAATACACAACAGGG 

ATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATTCCATATCAAATAGGG 

CAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATGACCCCAGGGAAATA 

CGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCACTGAGGGATGTCGA 

CCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTTTTTAGATAGGGAGG

CCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGCAGGTTACCAAGGAA 

GCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCTAACTGGTTT 

GCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAAAATGATAAGTACTAC 

TTAGTAGGAGATGTTGGAGAGCTAAAAGATCAAGCTAAAGCACTTGGGGCCAC 

GGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCATGAAGC 

TATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGAGTTTAACTCCACTGT 

TTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGAGCAATAAGGGGCAC 

ATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGGTTGCGG 

GGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCATATGAAG 

CTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAAGAAACCTAGGGTT 

AAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTAAAAAAATAAGGTTT 

CAAGGAAACCTCAACACCAAGAAAATGCTCAACCCGGGGAAACTATCTGAACA 

GTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATTGGTACT 

ATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAGGGCCCA

AACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACAAGAGTG 

AAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGAGATTTTCCACACGA 

TAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAGCAACTT 

GAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGAAGAACA 

TCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAACAATTGGTCAGGGAT 

CTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCAATACCAAAAAATGA 

GAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTTGAGAAG 

AGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGGCTAGCCATCACTAA 

GGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATTCCAGGATATGAAG

GAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAGGAATGGGACTCGT

TCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACACTCAAGTG 

ACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATATTACTATAAGAAG 

GAGTGGCACAAGTTCATTGACACCATCACCGACCACATGACAGAAGTACCAGT 
TATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGGAGCGGC

CAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCTGACAATGATGTA 

CGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGGTGGCAA 

GGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAAAAAGGGTTAGGG

CTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGCAGGCAAACCTCAG 

AAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGATTTGAGGATATAGA 

GTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACAACACCAGTAGTCA 

CATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACAAGATTGG 

ATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAGCCTTCAGT 

TTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCTGTTGGTC

CTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTATTACAAA

GGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAAGTGAACT

GAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAGCCT 

GGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTGTGTTGCCA 

TTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGATATCCAGC 

AAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATT

TATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATGGGGGTTGGGA 

CTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAGGTTGAAA 

ATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAaggttggggta^

cttccttctttaatggtggctccatcttagccctagfcacggctagctgt^ 

ggcctctctgcagatcatgtCCCCCGGCCGTCGGCGTCAGCTGAgacaaaatgtatatattgtaaataaattaatc 
catgtacatagtgtatataaatatagttgggaccgtccacctcaagaagacgacacgcccaacacgcacagctaaacagtagtcaagatt
atctacctcaagataacactacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatagttggactagggaa 
ctaacagccccc 
Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGA

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCCCT

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCTCGGG

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC 

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG

GTCAACTGTTTACCTTCTCTCCCAGGCGCCACTGGACGACGCAAGACTGCAATT 

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC 

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG 

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC 

AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC 

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC

AACTCTTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGACGCCTTACCGATTTT 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 

GCCCCTACTGCTGGCACTACCCTCCAAGACCTTGTGGCATTGTGCCCGCAAAG 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAAC 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCTTCGTCCTTAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACC 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC 

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGGCACTATCCTTGTACCATCAATTAC

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACAGAGTGGCAGGTCCTTCCGT 

GTTCTTTCACGACCCTGCCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGA 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAGGGTCAAGCATCGCGTCCTGG 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTCCTGCTTGCAGACGCGCGC

GTCTGCTCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCTTTG 

GAGAACCTCGTAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 

GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 
CGGAGCGGTCTACGCCTTCTACGGGAAGTGG

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT 

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC

CACCAGCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGAT

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTC 

AGCCTGGTATCTGCGGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATC 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTA 

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA 

TTGTTGCAATGTGTGCCTATCTTATTGCTGGTC^ 

TAACCCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCT

AACTGTTAGGACTGATACAGAuAAGAAGTTGGCTAGGGGGGATAGACTATACAA 

GAGTTGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTC

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCC^ 

CAACACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACT 

TAACTTTGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAG 

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGG 

TCCATGGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGG

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG

CCGATAACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAG

AATCTTTATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAA 

AGCATAGGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCT 

GAGTGTAATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAG

CATGTTGGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGA 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACA 

GAGTCCCTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACA 

ATGGCTTTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAAC^ 

TACTGGCAACTAAAGTAAAAATGCTCATGGT^ 

GGTAATCTGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA 

CGAGCTTACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACAC

CAAGGCGGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGA 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGG 

CAGTCGTTCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCA 

GGCACACCGGCTTTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCT

ATATTTGAAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAA 

ACAGAGCAGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGG

AGACTTCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCC 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAGGGCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCTCTTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA

CCGGGATAACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGC 

TCAGAGCTGCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGC 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTAT 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 

CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG 
ATCTTGGTAGTCAGTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGA

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGCAACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCC 

AGGACTGATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTATTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATA 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGG 

CCTGATCCTGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCCTCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGT 

GGGTTACCAGGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATAC 

CATCGAGGACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATGCAGCTGGGGGACTGGAGT 

TTGTTAAATCCCAAGCAGAAAAGATAAAAACAGCTCCTTTGTTTAAAGAAAACG 

CAGAAGCCGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATA

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATAGCTGCAAGACTGGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAA

GTGGCTAGCTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCA 

GTTGATTTAGTGGTCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCC 

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACT

GGCAACCTACACATACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGA

ACCAGCCCTGGCTTACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCC

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT

GGGGGTAGGGGCAATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAA

AGGACCCTACTTATGAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACA 

GATGAGCTGGTAAAAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCA 

GTCCAGACAATTGGTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTAC 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACT 

TATTCACATTGATAATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAG 

GGAAAATAAGGAACCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTAC 

ACAAGCAAATCAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGC

ACCCTTTAGTTGTGACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAG

ACAACTATTTGAGGGTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCT 

TTCAAAAATGTAGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCT 

ATGTAGAAACAGACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATT 

ACGATGACAACCTCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTA 

GAGCACTACTACAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAAT 

GCTCTTGGCCACTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAG 

CTAAGAGATATACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCC 

AATCACCGTGCTCTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGT 

ACAGTTTCTAAAAATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTC

CAATCTGACCAGGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGG 

AAATACCCACCGCTACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAG 
ACGTAGGGACTATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTA

GTTGATATCAATTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGAT 

CACAATAATTGGAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGG 

AAAAAGTAGAGCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTG 

GATGAGGGTAATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGA 

AATACACAACAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATT 

CCATATCAAATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATG 

ACCCCAGGGAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCA

CTGAGGGATGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTT 

TTTAGATAGGGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGC 

AGGTTACCAAGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAG 

ATCCCTAACTGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAA 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGCTAAAAGATCAAGCTAAA

GCACTTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGA 

CGTATGCCATGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGA 

GTTTAACTCCACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGA

GCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAG 

CCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGAT 

ACACCCATATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAA 

GAAACCTAGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTA 

AAAAAATAAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCAGGG 

AAACTATCTGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCA 

CCAGATTGGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAA

TAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAG 

ATAGACAAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGA 

GATTTTCCACACGATAGGCCAACCCACCCTGAAACACACCTACGGTGAGGTGA 

CGTGGGAGCAACTTGAGGCGGGGGTAAATAGAAAGGGGGCAGCAGGCTTCCT 

GGAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAAC 

AATTGGTCAGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCA

ATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACC

TGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGG

CTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATT

CCAGGATATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAG

GAATGGGACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTG

GGACACTCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATA

TTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACATGAC 

AGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGA 

GAGGGAGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCT 

GACAATGATGTACGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCA 

ACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAA 

AAAGGGTTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGC 

AGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGAT 

TTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACA 

ACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATG 

GCAACAAGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGC 

GGTAGCCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGAT 

TTGCCTGTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCAC 

TTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAA 

TCTAAGTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAG

CCTGTCCACGTTGGGGGTCTGGACTAAGCACACAAGCAAAAGAATAATTCAGG

ACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAAGCCCGACAGG 

CTGATATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTAC 

AAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATG 

GGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAG 



WO 99/55366

PCT/US99/08850

54/67

AAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtat 
atattgtaaataaattaatccatgtacatagtgtatataaatatagttgggaccgtccacctcaagaagacgacacgcccaacacgcacag 
ctaaacagtagtcaagattatctacctcaagataacactacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatag 
ttggactagggaagacctctaacagccccc 
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GtatacEagaattagaaaaggcactcgtatacgiattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct 
eatHHgggcBacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

rgtcgtecagcctccaggaccccccctccxgggagagccamgtggtctgcgga^ 

cgggVccttfcttggataaacccgct^ 

AAGTTCCCGGGTX3GCGOT 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAG 
CCT^GAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGG/i 
r^OnGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

?g^§gLa1^ 

cIIgIc^o^^^^ 
gtSactc^a^^^ 

gSctatctatcccggccatataacgggtcatcc3catc . , 

?^^r^TCCCCTACGG^ a 

caa£cca?cat^ 3 

AGCGTAmCTCCATGGT^ % 

a^tcaaa^^aa^ 

AACTOTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGACGCOTA^ 

a^gtctct^^ 

a^gSaaSSa^^^ 
^gc^tgctcg^ 

A(^ATATTCA^AGTCAGGATC 

cggcctgcaactg^^ 

GTOCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGOT 

GTTC^ACGA^ 
ACAT^GGAGOT^ 

GrGA^AAGTGGGAGTACGTCGTTCTCCTGTTCCTCCT 
GTCTG^CTGOTGTGGATGATGTTACTCATATCCCAA 

Q^Q/SSCT^^rAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 
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GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 

CGGAGCGGTCTACGCCTTCTACGGGAAGTGGGTCTTACTCTTATACCACATCTT 

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

CACCAGCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGAT

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTC 

AGCCTGGTATCTGCGGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATC 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTA

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA 

TTGTTGCAATGTGTGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCT 

TAACCCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAA

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA 

GAGTTGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTC

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAG 

CAACACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACT 

TAACTTTGGACnTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAG 

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGG

TCCATGGAAGAAGAGGAGAGGAAAGGCTTAAAGAAGTTTTATCTATTGTCTGG 

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG

CCGATAACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAG

AATCTTTATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAA 

AGCATAGGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCT 

GAGTGTAATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAG

CATGTTGGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGA 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACA 

GAGTCCCTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACA 

ATGGCTTTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCG

TACTGGCAACTAAAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATT 

GGTAATCTGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA 

CGAGCTTACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACAC 

CAAGGCGGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGA 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGG 

CAGTCGTTCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCA 

GGCACACCGGCTTTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCT

ATATTTGAAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAA 

ACAGAGCAGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGG 

AGACTTCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCC 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAGGGCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC

AAGCATCTCTTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA 

CCGGGATAACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGC 

TCAGAGCTGCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGC 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTAT

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 
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Satcca^aaotc^gagttgtgacatcacaatccccctatgtaatcgtggc 

?££aAATGCTA^G^^ 

? ACG^TOAAA^GAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 
?CG^ACAGGC^AAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TaS^caSag^ 

^ACTGATCACCCAGAGCC^ 

A A A ATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGG ATGTGCCCGTGTATA 

Tf^TACGCTACTGAAGAT^ 
OCTGATCCTGC^AACCAG 

S^AOCcScAAA^^ 

AAGAAGAAATAATGAGATATG^T ^ 

^ g ^^2aag^ctggggc atgaaac agcgtitgccacactagtotaaa 3 



nTGGCTAGCTTITGGAGGGGAATCAGTGTCAGACCACGTCAAGCAG^<^A 
^aSA^A^A^^AAAAOTGGAA £ 



aggXSota^atcaaggtct^^ 
^K^a£c^ag*a^™^^ 

TTCAAAAATGTAGGTGGCAAACTTACCAAAGTGGAGGAGAGCGG 

A^A^^TTCTAAAAATGAA^AAGGGGTGTGCGTTCACCTATGACCTGAC 
CA^TCTCACCAGGCTCATCG^ 
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AAATACCCACCGCTACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAG 

ACGTAGGGACTATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTA 

GTTGATATCAATTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGAT 

CACAATAATTGGAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGG 

AAAAAGTAGAGCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTG 

GATGAGGGTAATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGA 

AATACACAACAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATT 

CCATATCAAATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATG 

ACCCCAGGGAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCA 

CTGAGGGATGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTT 

TTTAGATAGGGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGC 

AGGTTACCAAGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAG 

ATCCCTAACTGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAA 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAA 

AGCACTTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGG 

ACGTATGCCATGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATG 

AGTTTAACTCCACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAG 

AGCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGA 

GCCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAG 

ATACACCCATATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAG

AAGAAACCTAGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACT 

TAAAAAAATAAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCTGG

GAAACTATCTGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAAC 

CACCAGATTGGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCC 

AATAGTGAGGGCCCAAACCGACACCAAAAGCTTTCATGAGGCAATAAGAGATA 

AGATAGACAAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTG

GAGATTTTCCACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGT 

GACGTGGGAGCAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTC 

CTGGAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGA 

ACAATTGGTCAGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTG 

CAATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGA 

CCTGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAA 

GGCTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTG 

ATTCCAGGATATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGA 

AAGGAATGGGACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGC

CTGGGACACTCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGA 

AATATTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACA 

TGACAGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGG 

CAGAGAGGGAGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATG

TCCTGACAATGATGTACGCCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGT 

TTCAACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAAC

TGAAAAAGGGTTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATG 

AAGCAGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTAT

AGATTTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCC 

GACAACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAA 

GATGGCAACAAGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAA 

AAGCGGTAGCCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGA 

GGATTTGCCTGTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATG 

CCACTTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTC 

GGAATCTAAGTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAAC

CTAAGCCTGTCCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAAT 

TCAGGACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCG 

ACAGGCTGATATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTA

CATTACAAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCG 
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GTCATGGGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCT 
GCTGAGAAGGTTGAAAATTCTGCTCATGACGGCC 

TTA^TC^GCCGAA^GCCGCTTGGAATAAGGCCGGTGTGCGTTTrGTCT 

TTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGGCCG 

TmOTGACGAGCATTC 

GTCTTGTCGAATGTCGTC 

CAACGTCTCTAGCGACCCTITGCAGGCAGCGGAACCCCCCACCTGGCGA^ 

TGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTC^ 

ACCCCAGTGOCACGTTGTGAGTTGGATAGT^ 

CCTCAAGCGTA^CAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATT 
ATXJGGATX^GATCTGG 

GTTAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAA 
AACACGATGATAAGCTTGCCACAACcatgaccgagtacaagcccacggtgcgcctcgccacccgcgacga 



cetttcttgrctggccgcgcagca^ 

ctteggcgtetcic^ 
cEBteweccttcctggagacctccgcgccccgcaacctcccct^ 
Icrceaaelaccgcgcg^ctggtgcatgacccgcaagcccggtgccTGAcgcccgc^ 

r^A(^A^A^CTT^GCCACTTATACATACCTGATAAAGGCITTACATTACAAGGAA 
GGGACTGAGAC^ 

GAAAA^CTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtatoUttgtaaata ^ 

aattaatccatgtacatagtgtttataaatatag^ 
J^attataacctol^ 
gaagacctctaacagccccc 



S 
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Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataamttaggcctaggtacatggcac 
Sfcttlttggatalac^^ 



^SCTAGATTCO^^^GACGAGGAAGACrrCCGAGCTCTCG^ 

SI^H^^^aI^^Sac 0 ^^ 

ArGGOTC^CGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTOTGGA 
fi^CGCTCCCA^CCCTGGCGCATGGCGTCCGGGTTCT 

^CTCTCTTGCCTGAC^GTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCr 

S^ACCATGTC^ 
CGAraCATC^^ 



ACATTGTGGACGTCKTAGTACITGTACGGGGTAGG^ 



so 



GTCAACTCTTTACC^CTCrc 
GTF 

CAAGCCATCAT<MACATO B 
AGCGTAmCTO^^GGCra^ g 
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CGGAGCGGTCTACGCCTTCTACGGGATGTGGCCTCTCCTCCTGCTCCTGCTGG 

CGTTGCCTCAGCGGGCATACGCACTGGACACGGAGGTGGCCGCGTCGTGTGG 

CGGCGTTGTTCTTGTCGGGTTAATGGCGCTGACTCTGTCGCCATATTACAAGCG

CTACATCAGCTGGTGCATGTGGTGGCTTCAGTATTTTCTGACCAGAGTAGAAGC 

GCAACTGCACGTGTGGGTTCCCCCCCTCAACGTCCGGGGGGGGCGCGATGCC 

GTCATCTTACTCATGTGTGTTGTACACCCGACTCTGGTATTTGACATCACCAAAC

TACTCCTGGCCATCTTCGGACCCCTTTGGATTCTTCAAGCCAGTTTGCTTAAAGT 

CCCCTACTTCGTGCGCGTTCAAGGCCTTCTCCGGATCTGCGCGCTAGCGCGGA

AGATAGCCGGAGGTCATTACGTGCAAATGGCCATCATCAAGTTAGGGGCGCTT 

ACTGGCACCTATGTGTATAACCATCTCACCCCTCTTCGAGACTGGGCGCACAAC 

GGCCTGCGAGATCTGGCCGTGGCTGTGGAACCAGTCGTCTTCTCCCGAATGGA 

GACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATC 

AACGGCTTGCCCGTCTCTGCCCGTAGGGGCCAGGAGATACTGCTTGGGCCAGC 

CGACGGAATGGTCTCCAAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTAC 

GCCCAGCAGACGAGAGGCCTCCTAGGGTGTATAATCACCAGCCTGACTGGCCG 

GGACAAAAACCAAGTGGAGGGTGAGGTCCAGATCGTGTCAACTGCTACCCAAA 

CCTTCCTGGCAACGTGCATCAATGGGGTATGCTGGACTGTCTACCACGGGGCC 

GGAACGAGGACCATCGCATCACCCAAGGGTCCTGTCATCCAGATGTATACCAA 

TGTGGACCAAGACCTTGTGGGCTGGCCCGCTCCTCAAGGTTCCCGCTCATTGA

CACCCTGCACCTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGAT 

GTCATTCCCGTGCGCCGGCGAGGTGATAGCAGGGGTAGCCTGCTTTCGCCCCG 

GCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCGGCTGTTGTGCCCCGCGG 

GACACGCCGTGGGCCTATTCAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAA

GGCGGTGGACTTTATCCCTGTGGAGAACCTAGAGACAACCATGAGATCCCCGG 

TGTTCACGGACAACTCCTCTCCACCAGCAGTGCCCCAGAGCTTCCAGGTGGCC 

CACCTGCATGCTCCCACCGGCAGCGGTAAGAGCACCAAGGTCCCGGCTGCGTA 

CGCAGCCCAGGGCTACAAGGTGTTGGTGCTCAACCCCTCTGTTGCTGCAACGC

TGGGCTTTGGTGCTTACATGTCCAAGGCCCATGGGGTTGATCCTAATATCAGGA 

CCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACCTACGGC 

AAGTTCCTTGCCGACGGCGGGTGCTCAGGAGGTGCTTATGACATAATAATTTGT 

GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCT 

TGACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACTGCTACC 

CCTCCGGGCTCCGTCACTGTGTCCCATCCTAACATCGAGGAGGTTGCTCTGTCC 

ACCACCGGAGAGATCCCCTTTTACGGCAAGGCTATCCCCCTCGAGGTGATCAA 

GGGGGGAAGACATCTCATCTTCTGCCACTCAAAGAAGAAGTGCGACGAGCTCG

CCGCGAAGCTGGTCGCATTGGGCATCAATGCCGTGGCCTACTACCGCGGTCTT 

GACGTGTCTGTCATCCCGACCAGCGGCGATGTTGTCGTCGTGTCGACCGATGC

TCTCATGACTGGCTTTACCGGCGACTTCGACTCTGTGATAGACTGCAACACGTG 

TGTCACTCAGACAGTCGATTTCAGCCTTGACCCTACCnTACCATTGAGACAAC 

CACGCTCCCCCAGGATGCTGTCTCCAGGACTCAACGCCGGGGCAGGACTGGC 

AGGGGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCG 

GCATGTTCGACTCGTCCGTCCTCTGTGAGTGCTATGACGCGGGCTGTGCTTGG 

TATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGAGCGTACATGAACAC 

CCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGCGTCTTTA 

CGGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG 

GAGAACTTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCTCA 

AGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATCCGCCTTAAAC 

CCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 

GAAGTCACCCTGACGCACCCAATCACCAAATACATCATGACATGCATGTCGGCC 

GACCTGGAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTG 

CTCTGGCCGCGTATTGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGATT 

GTCTTGTCCGGGAAGCCGGCAATTATACCTGACAGGGAGGTTCTCTACCAGGA 

GTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAAGGGA 

TGATGCTCGCTGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCG 
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ACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTTAAC 

TTTGGACnTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGG 

TACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCAT 

GGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGGAAGGT 

TGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTTCTTGGT 

ACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGCC 

AGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCG 

AGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAGCCGATA 

ACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAGAATCTTT 

ATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATA 

GGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCTGAGTGT

AATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAGCATGTT

GGGCCTGAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCAC 

AGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACAGAGTCC

CTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACAATGGCT

TTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCGTACTGG

CAACTAAAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATTGGTAATC

TGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACA

GAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATTTTTCGGG 

ATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTACGAGCTT 

ACTAAAAGTGAGGAGGGGTCTGGAGACTGGCTGGGCTTACACACACCAAGGC 

GGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGA 

cagcatgggacgaactagagtggtttgccaaagcaacaacaggttgaccgatg 
agacagagtatggcgtcaagactgactcagggtgcccagacggtgccagatg 
ttatgtgttaaatccagaggccgttaacatatcaggatccaaaggggcagtcgt 
tcacctccaaaagacaggtggagaattcacgtgtgtcaccgcatcaggcacac 
cggctttcttcgacctaaaaaacttgaaaggatggtcaggcttgcctatatttg
aagcctccagcgggagggtggttggcagagtcaaagtagggaagaatgaaga
gtctaaacctacaaaaataatgagtggaatccagaccgtctcaaaaaacacagc
agacctgaccgagatggtcaagaagataaccagcatgaacaggggagacttca
agcagattactttggcaacaggggcaggcaaaaccacagaactcccaaaagca

GTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATACCATTAAGG

GCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTC

TTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATA 

ACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGCTCAGAGCT 

GCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACTCCTG

AACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTATAAGGGTT 

GTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGTCAAAAGC 

ACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGGATCTTGGT 

AGTCAGTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGG 

CAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTAGCAAAGA 

AGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGAGGATCCA 

GCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGCTACAAAT 

GCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGACACGGG 

GTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCATCGTAA 

CAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGG 

CAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAAACAGCA 

ACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATACGGGAT

TGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACGATTGGA 

GCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTAAATAATC 

TACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCCAGGACTG 

ATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCAGGTCCCG 

GTCCTGTTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACGAAAATTAC 

TCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATATCTACGCT 







GTTGGGGGTAGG 







CACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGAGCAATAAG 

GGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGG 

TTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCAT 

ATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAAGAAACCT 

AGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTAAAAAAAT 

AAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCTGGGAAACTATC 

TGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATT 

GGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAG 

GGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACA

AGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGAGATTTTCC 

ACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAG 

CAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGA 

AGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAACAATTGGTC 

AGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCAATACCAAA 

AAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTT 

GAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGGCTAGCCAT 

CACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATTCCAGGAT

ATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAGGAATGG

GACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACAC 

TCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATATTACTA 

TAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACATGACAGAAG 

TACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGG 

AGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCTGACAAT 

GATGTACGCCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGG

TGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAAAAAGGG

TTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGCAGGCAA

ACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGATTTGAGG

ATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACAACACCA

GTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACA

AGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAG

CCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCT

GTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTA 

TTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAA

GTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAGCCTG 

TCCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTG 

TGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGA

TATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTACAAG 

GAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATGGGG 

GTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAG 

GTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtatatattgt

? antaaanaatc rat^arjata^g tatataaat amgttgggaccgtccacctcaagaagacga(^cgcccaacacgcacagctaaac 

agtagtcaagattatctacctc^gataacactacatttaatgcac^ 

tagggaagacctctaacagccccc 
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